NAG CL Interface
g07dcc (robust_1var_mestim_wgt)
1
Purpose
g07dcc computes an $M$-estimate of location with (optional) simultaneous estimation of scale, where you provide the weight functions.
2
Specification
void g07dcc (
    double (*chi)(double t, Nag_Comm *comm),
    double (*psi)(double t, Nag_Comm *comm),
    Integer isigma, Integer n, const double x[], double beta,
    double *theta, double *sigma, Integer maxit, double tol,
    double rs[], Integer *nit, Nag_Comm *comm, NagError *fail)
The function may be called by the names: g07dcc, nag_univar_robust_1var_mestim_wgt or nag_robust_m_estim_1var_usr.
3
Description
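The data consists of a sample of size $n$, denoted by $x_1, x_2, \ldots, x_n$, drawn from a random variable $X$.
The $x_i$ are assumed to be independent with an unknown distribution function of the form
$$F\left(\frac{x_i - \theta}{\sigma}\right),$$
where $\theta$ is a location parameter and $\sigma$ is a scale parameter. $M$-estimators of $\theta$ and $\sigma$ are given by the solution to the following system of equations:
$$\sum_{i=1}^{n} \psi\left(\frac{x_i - \hat{\theta}}{\hat{\sigma}}\right) = 0$$
$$\sum_{i=1}^{n} \chi\left(\frac{x_i - \hat{\theta}}{\hat{\sigma}}\right) = (n-1)\beta$$
where $\psi$ and $\chi$ are user-supplied weight functions and $\beta$ is a constant. Optionally, the second equation can be omitted and the first equation solved for $\hat{\theta}$ using an assigned value of $\sigma = \sigma_c$.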
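As a sanity check on the estimating equations, take them as $\sum_i \psi(r_i) = 0$ and $\sum_i \chi(r_i) = (n-1)\beta$ with $r_i = (x_i - \hat{\theta})/\hat{\sigma}$, and consider the least-squares weight functions $\psi(t) = t$ and $\chi(t) = t^2/2$. This choice is an illustrative assumption, not one recommended by this document; for it, $\beta = E(z^2/2) = 1/2$ under the standard Normal. The system then reduces to the sample mean and the usual sample standard deviation:

```latex
\begin{align*}
\sum_{i=1}^{n} \frac{x_i - \hat{\theta}}{\hat{\sigma}} = 0
  &\;\Longrightarrow\; \hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}, \\
\sum_{i=1}^{n} \frac{(x_i - \hat{\theta})^2}{2\hat{\sigma}^2} = \frac{n-1}{2}
  &\;\Longrightarrow\; \hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 .
\end{align*}
```

Robust choices of $\psi$ and $\chi$ bound these functions, so that outlying observations have limited influence on the solution.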
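The constant $\beta$ should be chosen so that $\hat{\sigma}$ is an unbiased estimator when $x_i$, for $i = 1, 2, \ldots, n$, has a Normal distribution. To achieve this the value of $\beta$ is calculated as:
$$\beta = E(\chi) = \int_{-\infty}^{\infty} \chi(z) \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{z^2}{2}\right\} dz$$
The values of $\psi\left(\frac{x_i - \hat{\theta}}{\hat{\sigma}}\right)\hat{\sigma}$ are known as the Winsorized residuals.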
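The equations are solved by a simple iterative procedure, suggested by Huber:
$$\hat{\sigma}_k = \sqrt{\frac{1}{\beta(n-1)} \left(\sum_{i=1}^{n} \chi\left(\frac{x_i - \hat{\theta}_{k-1}}{\hat{\sigma}_{k-1}}\right)\right) \hat{\sigma}_{k-1}^2}$$
and
$$\hat{\theta}_k = \hat{\theta}_{k-1} + \frac{1}{n} \sum_{i=1}^{n} \psi\left(\frac{x_i - \hat{\theta}_{k-1}}{\hat{\sigma}_k}\right) \hat{\sigma}_k$$
or $\hat{\sigma}_k = \sigma_c$ if $\sigma$ is fixed.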
The initial values for $\hat{\theta}$ and $\hat{\sigma}$ may be user-supplied or calculated within g07dbc as the sample median and an estimate of $\sigma$ based on the median absolute deviation respectively.
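Default starting values of this kind can be computed as follows. This is a sketch under stated assumptions: the function name `starting_values` is hypothetical, and the scale is taken as the median absolute deviation divided by 0.6745 (the 0.75 quantile of the standard Normal), which is the common Normal-consistency adjustment. Whether the library uses exactly this constant is not stated in this document.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of v[0..n-1]; sorts v in place. */
static double median_inplace(double *v, size_t n)
{
    qsort(v, n, sizeof *v, cmp_double);
    return (n % 2) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

/* Sets *theta0 to the sample median and *sigma0 to MAD / 0.6745.
   Returns 0 on success, -1 if memory allocation fails. */
int starting_values(const double *x, size_t n, double *theta0, double *sigma0)
{
    assert(n > 1);
    double *w = malloc(n * sizeof *w);
    if (!w)
        return -1;
    memcpy(w, x, n * sizeof *w);
    *theta0 = median_inplace(w, n);
    for (size_t i = 0; i < n; ++i)
        w[i] = fabs(x[i] - *theta0);     /* absolute deviations from median */
    *sigma0 = median_inplace(w, n) / 0.6744897501960817;
    free(w);
    return 0;
}
```

Note that the scale estimate is zero when more than half of the observations coincide with the median, so a caller supplying its own starting values should check for that case.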
g07dcc is based upon function LYHALG within the ROBETH library, see
Marazzi (1987).
4
References
Hampel F R, Ronchetti E M, Rousseeuw P J and Stahel W A (1986) Robust Statistics. The Approach Based on Influence Functions Wiley
Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Subroutines for robust estimation of location and scale in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 1 Institut Universitaire de Médecine Sociale et Préventive, Lausanne
5
Arguments
1: chi – function, supplied by the user    External Function
chi must return the value of the weight function $\chi$ for a given value of its argument. The value of $\chi$ must be non-negative.
The specification of chi is:
    double chi (double t, Nag_Comm *comm)
    1: t – double    Input
    On entry: the argument for which chi must be evaluated.
    2: comm – Nag_Comm *
    Pointer to structure of type Nag_Comm; the following members are relevant to chi.
    - user – double *
    - iuser – Integer *
    - p – Pointer
    The type Pointer will be void *. Before calling g07dcc you may allocate memory and initialize these pointers with various quantities for use by chi when called from g07dcc (see Section 3.1.1 in the Introduction to the NAG Library CL Interface).
Note: chi should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by g07dcc. If your code inadvertently does return any NaNs or infinities, g07dcc is likely to produce unexpected results.
2: psi – function, supplied by the user    External Function
psi must return the value of the weight function $\psi$ for a given value of its argument.
The specification of psi is:
    double psi (double t, Nag_Comm *comm)
    1: t – double    Input
    On entry: the argument for which psi must be evaluated.
    2: comm – Nag_Comm *
    Pointer to structure of type Nag_Comm; the following members are relevant to psi.
    - user – double *
    - iuser – Integer *
    - p – Pointer
    The type Pointer will be void *. Before calling g07dcc you may allocate memory and initialize these pointers with various quantities for use by psi when called from g07dcc (see Section 3.1.1 in the Introduction to the NAG Library CL Interface).
Note: psi should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by g07dcc. If your code inadvertently does return any NaNs or infinities, g07dcc is likely to produce unexpected results.
3: isigma – Integer    Input
On entry: the value assigned to isigma determines whether $\hat{\sigma}$ is to be simultaneously estimated.
- isigma = 0: the estimation of $\hat{\sigma}$ is bypassed and sigma is set equal to $\sigma_c$.
- isigma = 1: $\hat{\sigma}$ is estimated simultaneously.
4: n – Integer    Input
On entry: $n$, the number of observations.
Constraint: n > 1.
5: x[n] – const double    Input
On entry: the vector of observations, $x_1, x_2, \ldots, x_n$.
6: beta – double    Input
On entry: the value of the constant $\beta$ of the chosen chi function.
Constraint: beta > 0.0.
7: theta – double *    Input/Output
On entry: if sigma > 0, theta must be set to the required starting value of the estimate of the location parameter $\hat{\theta}$. A reasonable initial value for $\hat{\theta}$ will often be the sample mean or median.
On exit: the $M$-estimate of the location parameter $\hat{\theta}$.
8: sigma – double *    Input/Output
On entry: the role of sigma depends on the value assigned to isigma as follows.
If isigma = 1, sigma must be assigned a value which determines the starting points for the calculation of $\hat{\theta}$ and $\hat{\sigma}$. If sigma ≤ 0.0, g07dcc will determine the starting points of $\hat{\theta}$ and $\hat{\sigma}$. Otherwise, the value assigned to sigma will be taken as the starting point for $\hat{\sigma}$, and theta must be assigned a relevant value before entry, see above.
If isigma = 0, sigma must be assigned a value which determines the value of $\sigma_c$, which is held fixed during the iterations, and the starting value for the calculation of $\hat{\theta}$. If sigma ≤ 0.0, g07dcc will determine the value of $\sigma_c$ as the median absolute deviation adjusted to reduce bias (see g07dac) and the starting point for $\hat{\theta}$. Otherwise, the value assigned to sigma will be taken as the value of $\sigma_c$ and theta must be assigned a relevant value before entry, see above.
On exit: the $M$-estimate of the scale parameter $\hat{\sigma}$, if isigma was assigned the value 1 on entry, otherwise sigma will contain the initial fixed value $\sigma_c$.
9: maxit – Integer    Input
On entry: the maximum number of iterations that should be used during the estimation.
Suggested value: maxit = 50.
Constraint: maxit > 0.
10: tol – double    Input
On entry: the relative precision for the final estimates. Convergence is assumed when the increments for theta and sigma are less than $\mathrm{tol} \times \max(1.0, \sigma_{k-1})$.
Constraint: tol > 0.0.
11: rs[n] – double    Output
On exit: the Winsorized residuals.
12: nit – Integer *    Output
On exit: the number of iterations that were used during the estimation.
13: comm – Nag_Comm *
The NAG communication argument (see Section 3.1.1 in the Introduction to the NAG Library CL Interface).
14: fail – NagError *    Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6
Error Indicators and Warnings
- NE_ALLOC_FAIL
  Dynamic memory allocation failed.
  See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
- NE_BAD_PARAM
  On entry, argument ⟨value⟩ had an illegal value.
- NE_FUN_RET_VAL
  The chi function returned a negative value: chi = ⟨value⟩.
- NE_INT
  On entry, isigma = ⟨value⟩.
  Constraint: isigma = 0 or 1.
  On entry, maxit = ⟨value⟩.
  Constraint: maxit > 0.
  On entry, n = ⟨value⟩.
  Constraint: n > 1.
- NE_INTERNAL_ERROR
  An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
  See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
- NE_NO_LICENCE
  Your licence key may have expired or may not have been installed correctly.
  See Section 8 in the Introduction to the NAG Library CL Interface for further information.
- NE_REAL
  On entry, beta = ⟨value⟩.
  Constraint: beta > 0.0.
  On entry, tol = ⟨value⟩.
  Constraint: tol > 0.0.
- NE_REAL_ARRAY_ELEM_CONS
  All elements of x are equal.
- NE_SIGMA_NEGATIVE
  Current estimate of sigma is zero or negative: sigma = ⟨value⟩. This error exit is very unlikely, although it may be caused by too large an initial value of sigma.
- NE_TOO_MANY_ITER
  Number of iterations required exceeds maxit: maxit = ⟨value⟩.
- NE_ZERO_RESID
  All Winsorized residuals are zero. This may occur when using the isigma = 0 option with a redescending $\psi$ function, i.e., Hampel's piecewise linear function, Andrews' sine wave, and Tukey's biweight.
  If the given value of $\sigma$ is too small, the standardized residuals $\frac{x_i - \hat{\theta}_k}{\sigma_c}$ will be large and all the residuals may fall into the region for which $\psi(t) = 0$. This may incorrectly terminate the iterations, thus making theta and sigma invalid.
  Re-enter the function with a larger value of $\sigma_c$ or with sigma ≤ 0.0.
7
Accuracy
On successful exit the accuracy of the results is related to the value of
tol, see
Section 5.
8
Parallelism and Performance
g07dcc is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g07dcc makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the
X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the
Users' Note for your implementation for any additional implementation-specific information.
9
Further Comments
Standard forms of the functions $\psi$ and $\chi$ are given in Hampel et al. (1986), Huber (1981) and Marazzi (1987). g07dbc calculates $M$-estimates using some standard forms for $\psi$ and $\chi$.
When you supply the initial values, care has to be taken over the choice of the initial value of $\sigma$. If too small a value is chosen then initial values of the standardized residuals $\frac{x_i - \hat{\theta}_k}{\sigma}$ will be large. If the redescending $\psi$ functions are used, i.e., $\psi(t) = 0$ if $|t| > \tau$, for some positive constant $\tau$, then these large values are Winsorized as zero. If a sufficient number of the residuals fall into this category then a false solution may be returned, see page 152 of Hampel et al. (1986).
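One way to guard against such a false solution is to check, before calling the function with your own starting values, what fraction of the observations would be Winsorized to zero. The helper below is a hypothetical sketch (the name `fraction_rejected` is not part of the NAG Library). It assumes a redescending $\psi$ that vanishes for $|t| > \tau$ and simply counts the standardized residuals beyond that cut-off.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Fraction of observations whose standardized residual (x_i - theta)/sigma
   exceeds tau in absolute value, i.e. those a redescending psi with
   cut-off tau would map to zero. */
double fraction_rejected(const double *x, size_t n,
                         double theta, double sigma, double tau)
{
    assert(n > 0 && sigma > 0.0 && tau > 0.0);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (fabs((x[i] - theta) / sigma) > tau)
            ++count;
    return (double)count / (double)n;
}
```

For the data 1, 2, 3, 4, 10 with a starting location of 2.5 and $\tau = 4.5$, a starting scale of 0.01 rejects every observation, whereas a starting scale of 1.5 rejects only the value 10. A fraction close to one suggests the starting scale is too small.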
10
Example
The following program reads in a set of data consisting of eleven observations of a variable $X$.
The psi and chi functions used are Hampel's piecewise linear function and Huber's chi function respectively.
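The two weight functions named here have simple closed forms. The sketch below writes them as plain C functions using their standard textbook definitions. The names `hampel_psi` and `huber_chi` and the parameter names `h1`, `h2`, `h3` and `d` are hypothetical, and the constants actually used by the example program are not given in this document. A callback passed to g07dcc would need the `(double t, Nag_Comm *comm)` signature given in Section 5, and could, for instance, read such constants from `comm->user`.

```c
#include <assert.h>
#include <math.h>

/* Hampel's piecewise linear psi with break points 0 < h1 <= h2 < h3:
   linear up to h1, constant up to h2, descending to zero at h3,
   and zero beyond h3. */
double hampel_psi(double t, double h1, double h2, double h3)
{
    assert(0.0 < h1 && h1 <= h2 && h2 < h3);
    double a = fabs(t);
    double s = (t < 0.0) ? -1.0 : 1.0;
    if (a <= h1)
        return t;
    if (a <= h2)
        return s * h1;
    if (a <= h3)
        return s * h1 * (h3 - a) / (h3 - h2);
    return 0.0;
}

/* Huber's chi with cut-off d: t^2/2 inside, constant d^2/2 outside.
   Always non-negative, as chi is required to be. */
double huber_chi(double t, double d)
{
    assert(d > 0.0);
    return fabs(t) <= d ? 0.5 * t * t : 0.5 * d * d;
}
```

Because `hampel_psi` is exactly zero beyond `h3`, it is a redescending function, so the caution about small starting values of the scale applies to it.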
Using the following starting values, various estimates of $\theta$ and $\sigma$ are calculated and printed along with the number of iterations used:
(a) g07dcc determines the starting values; $\sigma$ is estimated simultaneously.
(b) You supply the starting values; $\sigma$ is estimated simultaneously.
(c) g07dcc determines the starting values; $\sigma$ is fixed.
(d) You supply the starting values; $\sigma$ is fixed.
10.1
Program Text
10.2
Program Data
10.3
Program Results