NAG Toolbox: nag_correg_robustm_corr_user (g02hm)
Purpose
nag_correg_robustm_corr_user (g02hm) computes a robust estimate of the covariance matrix for user-supplied weight functions. The derivatives of the weight functions are not required.
Syntax
[user, covar, a, wt, theta, nit, ifail] = g02hm(ucv, indm, x, a, theta, 'user', user, 'n', n, 'm', m, 'bl', bl, 'bd', bd, 'maxit', maxit, 'nitmon', nitmon, 'tol', tol)
[user, covar, a, wt, theta, nit, ifail] = nag_correg_robustm_corr_user(ucv, indm, x, a, theta, 'user', user, 'n', n, 'm', m, 'bl', bl, 'bd', bd, 'maxit', maxit, 'nitmon', nitmon, 'tol', tol)
Note: the interface to this routine has changed since earlier releases of the toolbox:
At Mark 23: nitmon and tol were made optional
At Mark 22: n was made optional
Description
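For a set of $n$ observations on $m$ variables in a matrix $X$, a robust estimate of the covariance matrix, $C$, and a robust estimate of location, $\theta$, are given by
$$C = \tau^2 (A^{T} A)^{-1},$$
where $\tau^2$ is a correction factor and $A$ is a lower triangular matrix found as the solution to the following equations:
$$z_i = A (x_i - \theta),$$
$$\frac{1}{n} \sum_{i=1}^{n} w(\|z_i\|_2)\, z_i = 0$$
and
$$\frac{1}{n} \sum_{i=1}^{n} \left[ u(\|z_i\|_2)\, z_i z_i^{T} - v(\|z_i\|_2)\, I \right] = 0,$$
where
- $x_i$ is a vector of length $m$ containing the elements of the $i$th row of $X$,
- $z_i$ is a vector of length $m$,
- $I$ is the identity matrix and $0$ is the zero matrix,
- $w$ and $u$ are suitable functions.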
nag_correg_robustm_corr_user (g02hm) covers two situations:
(i) $v(t) = 1$ for all $t$,
(ii) $v(t) = u(t)$.
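As a sanity check on the defining equations, the following sketch evaluates their residuals in the simplest possible setting. It assumes the equations have the form $\frac{1}{n}\sum_i w(\|z_i\|_2) z_i = 0$ and $\frac{1}{n}\sum_i [u(\|z_i\|_2) z_i z_i^T - v(\|z_i\|_2) I] = 0$ with $z_i = A(x_i - \theta)$, and it uses the trivial (non-robust) choice $u = w = v = 1$, for which the solution is the sample mean and the inverse Cholesky factor of the divisor-$n$ covariance. The data values and all variable names are illustrative only; g02hm itself is not called.

```matlab
% Illustrative data: 5 observations on 3 variables
x = [3.4 6.9 12.2; 6.4 2.5 15.1; 4.9 5.5 14.2; 7.3 1.9 18.2; 8.8 3.6 11.7];
[n, m] = size(x);

% Trivial weight functions (an assumption made to get a known solution)
u = @(t) 1;
w = @(t) 1;
v = @(t) 1;

theta = mean(x, 1)';                 % location that solves the first equation
xc = x - repmat(theta', n, 1);       % rows centred about theta
S = (xc' * xc) / n;                  % divisor-n covariance matrix
A = chol(S, 'lower') \ eye(m);       % lower triangular A with A*S*A' = I

% Accumulate the residuals of the two defining equations
r1 = zeros(m, 1);
r2 = zeros(m);
for i = 1:n
    z = A * xc(i, :)';
    t = norm(z);
    r1 = r1 + w(t) * z;
    r2 = r2 + u(t) * (z * z') - v(t) * eye(m);
end
r1 = r1 / n;
r2 = r2 / n;

% With a correction factor of 1, (A'*A)^(-1) reproduces S
C = inv(A' * A);
```

Both residuals vanish to rounding error here, and `C` matches `S`; with genuinely robust choices of $u$ and $w$ no such closed form exists, which is why an iterative procedure is needed.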
The robust covariance matrix may be calculated from a weighted sum of squares and cross-products matrix about $\theta$ using weights $wt_i = u(\|z_i\|_2)$. In case (i) a divisor of $n$ is used and in case (ii) a divisor of $\sum_{i=1}^{n} wt_i$ is used. If $w(\cdot) = \sqrt{u(\cdot)}$, then the robust covariance matrix can be calculated by scaling each row of $X$ by $\sqrt{wt_i}$ and calculating an unweighted covariance matrix about $\theta$.
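The weighted calculation described above can be sketched in a few lines. The inputs below are hypothetical stand-ins for the data matrix, the location estimate and the weights (in practice the last two would be returned by the routine), and the correction factor is deliberately left out.

```matlab
% Hypothetical inputs (illustrative values, not routine output)
x     = [1 2; 3 5; 4 4; 6 9];       % n-by-m data matrix
theta = [3.5; 5];                   % location estimate
wt    = [1; 1; 0.5; 0.25];          % one weight per observation

[n, m] = size(x);
xc = x - repmat(theta', n, 1);      % centre each row about theta

% Weighted sum of squares and cross-products matrix about theta
sscp = xc' * (xc .* repmat(wt, 1, m));

c_case_i  = sscp / n;               % case (i): divisor n
c_case_ii = sscp / sum(wt);         % case (ii): divisor is the sum of weights

% Equivalent route: scale the centred rows by sqrt(wt), then unweighted SSCP
xs = xc .* repmat(sqrt(wt), 1, m);
sscp_scaled = xs' * xs;
```

The two routes give the same matrix, which is the point of the row-scaling remark: any ordinary covariance code can be reused once the rows have been scaled.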
In order to make the estimate asymptotically unbiased under a Normal model a correction factor, $\tau^2$, is needed. The value of the correction factor will depend on the functions employed (see Huber (1981) and Marazzi (1987)).
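nag_correg_robustm_corr_user (g02hm) finds $A$ using the iterative procedure as given by Huber; see Huber (1981):
$$A_k = (S_k + I) A_{k-1}$$
and
$$\theta_{j,k} = \frac{b_j}{D_1} + \theta_{j,k-1},$$
where $S_k = (s_{jl})$, for $j = 1, 2, \ldots, m$ and $l = 1, 2, \ldots, m$, is a lower triangular matrix such that
$$s_{jl} = \begin{cases} -\min\left[\max\left(h_{jl}/D_2,\, -BL\right),\, BL\right], & j > l, \\ -\min\left[\max\left(\tfrac{1}{2}\left(h_{jj}/D_2 - 1\right),\, -BD\right),\, BD\right], & j = l, \end{cases}$$
where
- $D_1 = \sum_{i=1}^{n} w(\|z_i\|_2)$,
- $D_2 = \sum_{i=1}^{n} u(\|z_i\|_2)$,
- $h_{jl} = \sum_{i=1}^{n} u(\|z_i\|_2)\, z_{ij} z_{il}$, for $j \ge l$,
- $b_j = \sum_{i=1}^{n} w(\|z_i\|_2)\, (x_{ij} - \theta_{j,k-1})$,
and $BD$ and $BL$ are suitable bounds.
The value of $\tau$ may be chosen so that $C$ is unbiased if the observations are from a given distribution.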
nag_correg_robustm_corr_user (g02hm) is based on routines in ROBETH; see
Marazzi (1987).
References
Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Weights for bounded influence regression in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 3 Institut Universitaire de Médecine Sociale et Préventive, Lausanne
Parameters
Compulsory Input Parameters
- 1: ucv – function handle or string containing name of m-file
ucv must return the values of the functions $u$ and $w$ for a given value of its argument.
[user, u, w] = ucv(t, user)
Input Parameters
- 1: t – double scalar
The argument for which the functions $u$ and $w$ must be evaluated.
- 2: user – Any MATLAB object
ucv is called from nag_correg_robustm_corr_user (g02hm) with the object supplied to nag_correg_robustm_corr_user (g02hm).
Output Parameters
- 1: user – Any MATLAB object
- 2: u – double scalar
The value of the $u$ function at the point t.
- 3: w – double scalar
The value of the $w$ function at the point t.
- 2: indm – int64/int32/nag_int scalar
Indicates which form of the function $v$ will be used.
- indm = 1: $v = 1$.
- indm ≠ 1: $v = u$.
- 3: x(ldx, m) – double array
ldx, the first dimension of the array, must satisfy the constraint ldx ≥ n.
x(i, j) must contain the $i$th observation on the $j$th variable, for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.
- 4: a(m×(m+1)/2) – double array
An initial estimate of the lower triangular real matrix $A$. Only the lower triangular elements must be given and these should be stored row-wise in the array.
The diagonal elements must be nonzero, and in practice will usually be positive. If the magnitudes of the columns of $X$ are of the same order, the identity matrix will often provide a suitable initial value for $A$. If the columns of $X$ are of different magnitudes, the diagonal elements of the initial value of $A$ should be approximately inversely proportional to the magnitude of the columns of $X$.
Constraint: a(j×(j−1)/2 + j) ≠ 0.0, for $j = 1, 2, \ldots, m$.
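Row-wise packed storage of a lower triangular matrix is easy to get wrong, so here is a minimal sketch of building the packed vector from a full matrix. The starting matrix `A0` and the column magnitudes used to scale its diagonal are hypothetical. With row-wise packing, the diagonal element of row $j$ lands at position $j(j-1)/2 + j$.

```matlab
m = 3;

% Hypothetical starting matrix: diagonal inversely proportional to
% illustrative column magnitudes of 2, 3 and 4
A0 = diag(1 ./ [2, 3, 4]);

% Pack the lower triangle row by row: (1,1), (2,1), (2,2), (3,1), ...
a = zeros(m*(m+1)/2, 1);
k = 0;
for j = 1:m
    for l = 1:j
        k = k + 1;
        a(k) = A0(j, l);
    end
end

% Positions of the diagonal elements in the packed vector
diag_idx = (1:m) .* ((1:m) - 1) / 2 + (1:m);

% The diagonal entries must be nonzero before the vector is passed on
ok = all(a(diag_idx) ~= 0);
```

For `m = 3` the diagonal positions are 1, 3 and 6, which matches the layout `[1; 0; 1; 0; 0; 1]` used for the identity matrix in the example program.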
- 5: theta(m) – double array
An initial estimate of the location argument, $\theta_j$, for $j = 1, 2, \ldots, m$.
In many cases an initial estimate of $\theta_j = 0$, for $j = 1, 2, \ldots, m$, will be adequate. Alternatively medians may be used as given by nag_univar_robust_1var_median (g07da).
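A median-based starting value can be sketched as follows. This uses the built-in MATLAB `median` function as a stand-in, not nag_univar_robust_1var_median (g07da), and the data are the first five rows of the example data set, chosen only so that the medians are easy to check by hand.

```matlab
% First five observations from the example data (illustrative subset)
x = [3.4 6.9 12.2;
     6.4 2.5 15.1;
     4.9 5.5 14.2;
     7.3 1.9 18.2;
     8.8 3.6 11.7];

% Column medians, arranged as an m-by-1 vector to use as the initial theta
theta0 = median(x, 1)';
```

Here `theta0` holds one median per variable and could be supplied in place of a vector of zeros.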
Optional Input Parameters
- 1: user – Any MATLAB object
user is not used by nag_correg_robustm_corr_user (g02hm), but is passed to ucv. Note that for large objects it may be more efficient to use a global variable which is accessible from the m-files than to use user.
- 2: n – int64/int32/nag_int scalar
Default: the first dimension of the array x.
$n$, the number of observations.
Constraint: n > 1.
- 3: m – int64/int32/nag_int scalar
Default: the dimension of the array theta and the second dimension of the array x. (An error is raised if these dimensions are not equal.)
$m$, the number of columns of the matrix $X$, i.e., number of independent variables.
Constraint: 1 ≤ m ≤ n.
- 4: bl – double scalar
Default:
The magnitude of the bound for the off-diagonal elements of $S_k$, $BL$.
Constraint: bl > 0.0.
- 5: bd – double scalar
Default:
The magnitude of the bound for the diagonal elements of $S_k$, $BD$.
Constraint: bd > 0.0.
- 6: maxit – int64/int32/nag_int scalar
Default:
The maximum number of iterations that will be used during the calculation of $A$.
Constraint: maxit > 0.
- 7: nitmon – int64/int32/nag_int scalar
Default:
Indicates the amount of information on the iteration that is printed.
- nitmon > 0: The value of $A$, $\theta$ and $\delta$ (see Accuracy) will be printed at the first and every nitmon iterations.
- nitmon ≤ 0: No iteration monitoring is printed.
When printing occurs the output is directed to the current advisory message channel (see nag_file_set_unit_advisory (x04ab)).
- 8: tol – double scalar
Default:
The relative precision for the final estimate of the covariance matrix. Iteration will stop when maximum $\delta$ (see Accuracy) is less than tol.
Constraint: tol > 0.0.
Output Parameters
- 1: user – Any MATLAB object
- 2: covar(m×(m+1)/2) – double array
A robust estimate of the covariance matrix, $C$. The upper triangular part of the matrix $C$ is stored packed by columns (lower triangular stored by rows), that is $C_{ij}$ is returned in covar(j×(j−1)/2 + i), $i \le j$.
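To work with the result as an ordinary matrix, the packed vector has to be expanded. The sketch below assumes the column-wise upper-triangular layout described above, in which element $(i, j)$ with $i \le j$ sits at position $j(j-1)/2 + i$; the numbers are copied from the printed output of the example program rather than computed here.

```matlab
m = 3;

% Packed upper triangle, column by column (values from the example output)
covar = [3.2779; -3.6918; 5.2841; 4.7391; -6.4087; 11.8373];

% Expand into a full symmetric m-by-m matrix
C = zeros(m);
for j = 1:m
    for i = 1:j
        C(i, j) = covar(j*(j-1)/2 + i);
        C(j, i) = C(i, j);          % mirror into the lower triangle
    end
end
```

After the loop, `C(1,3)` holds 4.7391 and `C(2,3)` holds -6.4087, in agreement with the printed upper triangle.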
- 3: a(m×(m+1)/2) – double array
The lower triangular elements of the inverse of the matrix $A$, stored row-wise.
- 4: wt(n) – double array
wt(i) contains the weights, $wt_i = u(\|z_i\|_2)$, for $i = 1, 2, \ldots, n$.
- 5: theta(m) – double array
Contains the robust estimate of the location argument, $\theta_j$, for $j = 1, 2, \ldots, m$.
- 6: nit – int64/int32/nag_int scalar
The number of iterations performed.
- 7: ifail – int64/int32/nag_int scalar
ifail = 0 unless the function detects an error (see Error Indicators and Warnings).
Error Indicators and Warnings
Errors or warnings detected by the function:
Cases prefixed with W are classified as warnings and
do not generate an error of type NAG:error_n. See nag_issue_warnings.
- ifail = 1
On entry, n ≤ 1, or m < 1, or n < m, or ldx < n.
- ifail = 2
On entry, tol ≤ 0.0, or maxit ≤ 0, or diagonal element of a = 0.0, or bl ≤ 0.0, or bd ≤ 0.0.
- ifail = 3
A column of x has a constant value.
- ifail = 4
Value of u or w returned by ucv < 0.
- ifail = 5
The function has failed to converge in maxit iterations.
- W ifail = 6
Either the sum $D_1 = \sum_i w(\|z_i\|_2)$ or the sum $D_2 = \sum_i u(\|z_i\|_2)$ is zero. This may be caused by the functions $u$ or $w$ being too strict for the current estimate of $A$ (or $C$). You should either try a larger initial estimate of $A$ or make the $u$ and $w$ functions less strict.
- ifail = −99
An unexpected error has been triggered by this routine. Please contact NAG.
- ifail = −399
Your licence key may have expired or may not have been installed correctly.
- ifail = −999
Dynamic memory allocation failed.
Accuracy
On successful exit the accuracy of the results is related to the value of tol; see Parameters. At an iteration let
(i) $d_1$ = the maximum value of $|s_{jl}|$,
(ii) $d_2$ = the maximum absolute change in wt(i),
(iii) $d_3$ = the maximum absolute relative change in $\theta_j$,
and let $\delta = \max(d_1, d_2, d_3)$. Then the iterative procedure is assumed to have converged when $\delta <$ tol.
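The stopping rule can be illustrated with a small sketch. It assumes that the three monitored quantities are the largest element of the update matrix in absolute value, the largest absolute change in the weights, and the largest absolute relative change in the location estimate, taken relative to the previous iterate. All values, including the tolerance, are made up for illustration and all variable names are hypothetical.

```matlab
% Hypothetical quantities from two successive iterations
S         = [1e-5 0; -2e-5 3e-5];       % update matrix for A at this iteration
wt_old    = [1.00; 0.80; 0.50];
wt_new    = [1.00; 0.80002; 0.50001];
theta_old = [5.70; 3.86];
theta_new = [5.70001; 3.86];
tol       = 5e-5;                       % illustrative tolerance

d1 = max(abs(S(:)));                                  % largest |s_jl|
d2 = max(abs(wt_new - wt_old));                       % largest change in a weight
d3 = max(abs((theta_new - theta_old) ./ theta_old));  % largest relative change in theta

delta = max([d1, d2, d3]);
converged = delta < tol;
```

In this made-up case the update matrix dominates (`delta` is about 3e-5), so the iteration would be declared converged at the assumed tolerance.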
Further Comments
The existence of $A$ will depend upon the function $u$ (see Marazzi (1987)); also if $X$ is not of full rank a value of $A$ will not be found. If the columns of $X$ are almost linearly related, then convergence will be slow.
If derivatives of the $u$ and $w$ functions are available then the method used in nag_correg_robustm_corr_user_deriv (g02hl) will usually give much faster convergence.
Example
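A sample of 10 observations on three variables is read in along with initial values for $A$ and $\theta$ and argument values for the $u$ and $w$ functions, $c_u$ and $c_w$. The covariance matrix computed by nag_correg_robustm_corr_user (g02hm) is printed along with the robust estimate of $\theta$.
ucv computes Huber's weight functions:
$$u(t) = \begin{cases} 1, & t^2 \le c_u, \\ c_u / t^2, & t^2 > c_u, \end{cases}$$
and
$$w(t) = \begin{cases} 1, & t \le c_w, \\ c_w / t, & t > c_w. \end{cases}$$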
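function g02hm_example
% Robust covariance estimate using Huber's weight functions.
fprintf('g02hm example results\n\n');

% Data: 10 observations on 3 variables
x = [3.4, 6.9, 12.2;
     6.4, 2.5, 15.1;
     4.9, 5.5, 14.2;
     7.3, 1.9, 18.2;
     8.8, 3.6, 11.7;
     8.4, 1.3, 17.9;
     5.3, 3.1, 15.0;
     2.7, 8.1,  7.7;
     6.1, 3.0, 21.9;
     5.3, 2.2, 13.9];
[n, m] = size(x);

% Selects the form of the function v
indm = int64(1);

% Initial A: identity matrix, lower triangle stored row-wise
a = [1;
     0; 1;
     0; 0; 1];

% Initial location estimate
theta = zeros(m, 1);

% Constants [cu, cw] for the weight functions, passed through to ucv
user = [4, 2];

[user, covar, a, wt, theta, nit, ifail] = ...
    g02hm( ...
    @ucv, indm, x, a, theta, 'user', user);

fprintf(' iterations to convergence = %4d\n\n', nit);

% Print the packed upper triangle of the covariance matrix
mtitle = 'Robust covariance matrix';
order = int64(m);
uplo = 'Upper';
diagtype = 'Non-unit';      % renamed from diag to avoid shadowing the built-in
[ifail] = x04cc( ...
    uplo, diagtype, order, covar, mtitle);

fprintf('\n');
disp('Robust estimates of theta');
disp(theta);

function [userp, u, w] = ucv(t, userp)
% Huber's weight functions u and w evaluated at t.
cu = userp(1);
u = 1.0;
if (t ~= 0)
    t2 = t*t;
    if (t2 > cu)
        u = cu/t2;
    end
end
cw = userp(2);
if (t > cw)
    w = cw/t;
else
    w = 1.0;
end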
g02hm example results

 iterations to convergence =   34

 Robust covariance matrix
             1          2          3
 1      3.2779    -3.6918     4.7391
 2                 5.2841    -6.4087
 3                           11.8373

Robust estimates of theta
    5.6998
    3.8636
   14.7036
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015