g07 Chapter Contents
g07 Chapter Introduction
NAG Library Manual

# NAG Library Function Documentnag_robust_m_estim_1var_usr (g07dcc)

## 1  Purpose

nag_robust_m_estim_1var_usr (g07dcc) computes an $M$-estimate of location with (optional) simultaneous estimation of scale, where you provide the weight functions.

## 2  Specification

 #include #include
void  nag_robust_m_estim_1var_usr (
 double (*chi)(double t, Nag_Comm *comm),
 double (*psi)(double t, Nag_Comm *comm),
Integer isigma, Integer n, const double x[], double beta, double *theta, double *sigma, Integer maxit, double tol, double rs[], Integer *nit, Nag_Comm *comm, NagError *fail)

## 3  Description

The data consists of a sample of size $n$, denoted by ${x}_{1},{x}_{2},\dots ,{x}_{n}$, drawn from a random variable $X$.
The ${x}_{i}$ are assumed to be independent with an unknown distribution function of the form,
 $Fxi-θ/σ$
where $\theta$ is a location argument, and $\sigma$ is a scale argument. $M$-estimators of $\theta$ and $\sigma$ are given by the solution to the following system of equations;
 $∑i=1nψxi-θ^/σ^ = 0 ∑i=1nχxi-θ^/σ^ = n-1β$
where $\psi$ and $\chi$ are user-supplied weight functions, and $\beta$ is a constant. Optionally the second equation can be omitted and the first equation is solved for $\stackrel{^}{\theta }$ using an assigned value of $\sigma ={\sigma }_{c}$.
The constant $\beta$ should be chosen so that $\stackrel{^}{\sigma }$ is an unbiased estimator when ${x}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$ has a Normal distribution. To achieve this the value of $\beta$ is calculated as:
 $β=Eχ=∫-∞∞χz12πexp-z22dz$
The values of $\psi \left(\frac{{x}_{i}-\stackrel{^}{\theta }}{\stackrel{^}{\sigma }}\right)\stackrel{^}{\sigma }$ are known as the Winsorized residuals.
The equations are solved by a simple iterative procedure, suggested by Huber:
 $σ^k=1βn-1 ∑i=1nχ xi-θ^k-1σ^k-1 σ^k-12$
and
 $θ^k=θ^k- 1+1n ∑i= 1nψ xi-θ^k- 1σ^k σ^k$
or
 $σ^k=σc$
if $\sigma$ is fixed.
The initial values for $\stackrel{^}{\theta }$ and $\stackrel{^}{\sigma }$ may be user-supplied or calculated within nag_robust_m_estim_1var (g07dbc) as the sample median and an estimate of $\sigma$ based on the median absolute deviation respectively.
nag_robust_m_estim_1var_usr (g07dcc) is based upon function LYHALG within the ROBETH library, see Marazzi (1987).
Hampel F R, Ronchetti E M, Rousseeuw P J and Stahel W A (1986) Robust Statistics. The Approach Based on Influence Functions Wiley
Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Subroutines for robust estimation of location and scale in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 1 Institut Universitaire de Médecine Sociale et Préventive, Lausanne

## 5  Arguments

1:     chifunction, supplied by the userExternal Function
chi must return the value of the weight function $\chi$ for a given value of its argument. The value of $\chi$ must be non-negative.
The specification of chi is:
 double chi (double t, Nag_Comm *comm)
1:     tdoubleInput
On entry: the argument for which chi must be evaluated.
2:     commNag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to chi.
userdouble *
iuserInteger *
pPointer
The type Pointer will be void *. Before calling nag_robust_m_estim_1var_usr (g07dcc) you may allocate memory and initialize these pointers with various quantities for use by chi when called from nag_robust_m_estim_1var_usr (g07dcc) (see Section 3.2.1.1 in the Essential Introduction).
2:     psifunction, supplied by the userExternal Function
psi must return the value of the weight function $\psi$ for a given value of its argument.
The specification of psi is:
 double psi (double t, Nag_Comm *comm)
1:     tdoubleInput
On entry: the argument for which psi must be evaluated.
2:     commNag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to psi.
userdouble *
iuserInteger *
pPointer
The type Pointer will be void *. Before calling nag_robust_m_estim_1var_usr (g07dcc) you may allocate memory and initialize these pointers with various quantities for use by psi when called from nag_robust_m_estim_1var_usr (g07dcc) (see Section 3.2.1.1 in the Essential Introduction).
3:     isigmaIntegerInput
On entry: the value assigned to isigma determines whether $\stackrel{^}{\sigma }$ is to be simultaneously estimated.
${\mathbf{isigma}}=0$
The estimation of $\stackrel{^}{\sigma }$ is bypassed and sigma is set equal to ${\sigma }_{c}$.
${\mathbf{isigma}}=1$
$\stackrel{^}{\sigma }$ is estimated simultaneously.
4:     nIntegerInput
On entry: $n$, the number of observations.
Constraint: ${\mathbf{n}}>1$.
5:     x[n]const doubleInput
On entry: the vector of observations, ${x}_{1},{x}_{2},\dots ,{x}_{n}$.
On entry: the value of the constant $\beta$ of the chosen chi function.
Constraint: ${\mathbf{beta}}>0.0$.
On entry: if ${\mathbf{sigma}}>0$, then theta must be set to the required starting value of the estimate of the location argument $\stackrel{^}{\theta }$. A reasonable initial value for $\stackrel{^}{\theta }$ will often be the sample mean or median.
On exit: the $M$-estimate of the location argument $\stackrel{^}{\theta }$.
On entry: the role of sigma depends on the value assigned to isigma as follows.
If ${\mathbf{isigma}}=1$, sigma must be assigned a value which determines the values of the starting points for the calculation of $\stackrel{^}{\theta }$ and $\stackrel{^}{\sigma }$. If ${\mathbf{sigma}}\le 0.0$, then nag_robust_m_estim_1var_usr (g07dcc) will determine the starting points of $\stackrel{^}{\theta }$ and $\stackrel{^}{\sigma }$. Otherwise, the value assigned to sigma will be taken as the starting point for $\stackrel{^}{\sigma }$, and theta must be assigned a relevant value before entry, see above.
If ${\mathbf{isigma}}=0$, sigma must be assigned a value which determines the values of ${\sigma }_{c}$, which is held fixed during the iterations, and the starting value for the calculation of $\stackrel{^}{\theta }$. If ${\mathbf{sigma}}\le 0$, then nag_robust_m_estim_1var_usr (g07dcc) will determine the value of ${\sigma }_{c}$ as the median absolute deviation adjusted to reduce bias (see nag_median_1var (g07dac)) and the starting point for $\theta$. Otherwise, the value assigned to sigma will be taken as the value of ${\sigma }_{c}$ and theta must be assigned a relevant value before entry, see above.
On exit: the $M$-estimate of the scale argument $\stackrel{^}{\sigma }$, if isigma was assigned the value $1$ on entry, otherwise sigma will contain the initial fixed value ${\sigma }_{c}$.
9:     maxitIntegerInput
On entry: the maximum number of iterations that should be used during the estimation.
Suggested value: ${\mathbf{maxit}}=50$.
Constraint: ${\mathbf{maxit}}>0$.
10:   toldoubleInput
On entry: the relative precision for the final estimates. Convergence is assumed when the increments for theta, and sigma are less than ${\mathbf{tol}}×\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(1.0,{\sigma }_{k-1}\right)$.
Constraint: ${\mathbf{tol}}>0.0$.
11:   rs[n]doubleOutput
On exit: the Winsorized residuals.
12:   nitInteger *Output
On exit: the number of iterations that were used during the estimation.
13:   commNag_Comm *Communication Structure
The NAG communication argument (see Section 3.2.1.1 in the Essential Introduction).
14:   failNagError *Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).

## 6  Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
On entry, argument $⟨\mathit{\text{value}}⟩$ had an illegal value.
NE_FUN_RET_VAL
The chi function returned a negative value: ${\mathbf{chi}}=⟨\mathit{\text{value}}⟩$.
NE_INT
On entry, ${\mathbf{isigma}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{isigma}}=0$ or $1$.
On entry, ${\mathbf{maxit}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{maxit}}>0$.
On entry, ${\mathbf{n}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{n}}>1$.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_REAL
On entry, ${\mathbf{beta}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{beta}}>0.0$.
On entry, ${\mathbf{tol}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{tol}}>0.0$.
NE_REAL_ARRAY_ELEM_CONS
All elements of x are equal.
NE_SIGMA_NEGATIVE
Current estimate of sigma is zero or negative: ${\mathbf{sigma}}=⟨\mathit{\text{value}}⟩$.
NE_TOO_MANY_ITER
Number of iterations required exceeds maxit: ${\mathbf{maxit}}=⟨\mathit{\text{value}}⟩$.
NE_ZERO_RESID
All winsorized residuals are zero.

## 7  Accuracy

On successful exit the accuracy of the results is related to the value of tol, see Section 5.

## 8  Parallelism and Performance

nag_robust_m_estim_1var_usr (g07dcc) is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
nag_robust_m_estim_1var_usr (g07dcc) makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

Standard forms of the functions $\psi$ and $\chi$ are given in Hampel et al. (1986)Huber (1981) and Marazzi (1987). nag_robust_m_estim_1var (g07dbc) calculates $M$-estimates using some standard forms for $\psi$ and $\chi$.
When you supply the initial values, care has to be taken over the choice of the initial value of $\sigma$. If too small a value is chosen then initial values of the standardized residuals $\frac{{x}_{i}-{\stackrel{^}{\theta }}_{k}}{\sigma }$ will be large. If the redescending $\psi$ functions are used, i.e., $\psi =0$ if $\left|t\right|>\tau$, for some positive constant $\tau$, then these large values are Winsorized as zero. If a sufficient number of the residuals fall into this category then a false solution may be returned, see page 152 of Hampel et al. (1986).

## 10  Example

The following program reads in a set of data consisting of eleven observations of a variable $X$.
The psi and chi functions used are Hampel's Piecewise Linear Function and Hubers chi function respectively.
Using the following starting values various estimates of $\theta$ and $\sigma$ are calculated and printed along with the number of iterations used:
 (a) nag_robust_m_estim_1var_usr (g07dcc) determined the starting values, $\sigma$ is estimated simultaneously. (b) You must supply the starting values, $\sigma$ is estimated simultaneously. (c) nag_robust_m_estim_1var_usr (g07dcc) determined the starting values, $\sigma$ is fixed. (d) You must supply the starting values, $\sigma$ is fixed.

### 10.1  Program Text

Program Text (g07dcce.c)

### 10.2  Program Data

Program Data (g07dcce.d)

### 10.3  Program Results

Program Results (g07dcce.r)