PDF version (NAG web site
, 64-bit version, 64-bit version)
NAG Toolbox: nag_correg_linregm_var_del (g02df)
Purpose
nag_correg_linregm_var_del (g02df) deletes an independent variable from a general linear regression model.
Syntax
Note: the interface to this routine has changed since earlier releases of the toolbox:
At Mark 22: |
ip was made optional |
Description
When selecting a linear regression model it is sometimes useful to drop independent variables from the model and to examine the resulting sub-model.
nag_correg_linregm_var_del (g02df) updates the
decomposition used in the computation of the linear regression model. The
decomposition may come from
nag_correg_linregm_fit (g02da) or
nag_correg_linregm_var_add (g02de), or a previous call to
nag_correg_linregm_var_del (g02df).
For the general linear regression model with
independent variables fitted
nag_correg_linregm_fit (g02da) or
nag_correg_linregm_var_add (g02de) compute a
decomposition of the (weighted) independent variables and form an upper triangular matrix
and a vector
. To remove an independent variable
and
have to be updated. The column of
corresponding to the variable to be dropped is removed and the matrix is then restored to upper triangular form by applying a series of Givens rotations. The rotations are then applied to
. Note only the first
elements of
are affected.
The method used means that while the updated values of
and
are computed an updated value of
from the
decomposition is not available so a call to
nag_correg_linregm_var_add (g02de) cannot be made after a call to
nag_correg_linregm_var_del (g02df).
nag_correg_linregm_update (g02dd) can be used to calculate the parameter estimates,
, from the information provided by
nag_correg_linregm_var_del (g02df).
References
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25
Parameters
Compulsory Input Parameters
- 1:
– double array
-
ldq, the first dimension of the array, must satisfy the constraint
.
- 2:
– int64int32nag_int scalar
-
Indicates which independent variable is to be deleted from the model.
Constraint:
.
-
The residual sum of squares for the full regression.
Constraint:
.
Optional Input Parameters
- 1:
– int64int32nag_int scalar
-
Default:
the dimension of the array
q and the first dimension of the array
q. (An error is raised if these dimensions are not equal.)
, the number of independent variables already in the model.
Constraint:
.
Output Parameters
- 1:
– double array
-
The updated decomposition.
-
The residual sum of squares with the (
indx)th variable removed. Note that the residual sum of squares will only be valid if the regression is of full rank, otherwise the residual sum of squares should be obtained using
nag_correg_linregm_update (g02dd).
- 3:
– int64int32nag_int scalar
unless the function detects an error (see
Error Indicators and Warnings).
Error Indicators and Warnings
Errors or warnings detected by the function:
-
-
On entry, | , |
or | , |
or | , |
or | , |
or | . |
-
-
On entry, | a diagonal element of is zero. |
-
An unexpected error has been triggered by this routine. Please
contact
NAG.
-
Your licence key may have expired or may not have been installed correctly.
-
Dynamic memory allocation failed.
Accuracy
There will inevitably be some loss in accuracy in fitting a model by dropping terms from a more complex model rather than fitting it afresh using
nag_correg_linregm_fit (g02da).
Further Comments
None.
Example
A dataset consisting of
observations on four independent variables and one dependent variable is read in. The full model, including a mean term, is fitted using
nag_correg_linregm_fit (g02da). The value of
indx is read in and that variable dropped from the regression. The parameter estimates are calculated by
nag_correg_linregm_update (g02dd) and printed. This process is repeated until
indx is
.
Open in the MATLAB editor:
g02df_example
function g02df_example
fprintf('g02df example results\n\n');
x = [ 1.0 1.4 0.0 0.0;
1.5 2.2 0.0 0.0;
2.0 4.5 0.0 0.0;
2.5 6.1 0.0 0.0;
3.0 7.1 0.0 0.0;
3.5 7.7 0.0 0.0;
4.0 8.3 1.0 4.0;
4.5 8.6 1.0 4.5;
5.0 8.8 1.0 5.0;
5.5 9.0 1.0 5.5;
6.0 9.3 1.0 6.0;
6.5 9.2 1.0 6.5];
y = [ 4.32 5.21 6.49 7.10 7.94 8.53 ...
8.84 9.02 9.27 9.43 9.68 9.83];
[n,m] = size(x);
isx = ones(m,1,'int64');
mean_p = 'M';
ip = int64(m+1);
[rss, idf, b, se, covar, res, h, q, svd, irank, p, wk, ifail] = ...
g02da(mean_p, x, isx, ip, y);
if svd
fprintf('Model not of full rank\n\n', irank);
end
fprintf('Residual sum of squares = %12.4e\n', rss);
fprintf('Degrees of freedom = %4d\n', idf);
indx = int64([2, 4]);
perm = double([1:ip]);
for j = 1:numel(indx)
[q, rss, ifail] = g02df( ...
q, indx(j), rss, 'ip', ip);
perm(indx(j):ip-1) = perm(indx(j)+1:ip);
ip = ip - 1;
fprintf('\nVariable %2d dropped\n', indx(j));
[rss, idf, b, se, covar, svd, irank, p, ifail] = ...
g02dd(int64(n), ip, q, rss);
if svd
fprintf('Model not of full rank, rank = %4d\n\n', irank);
end
fprintf('Residual sum of squares = %12.4e\n', rss);
fprintf('Degrees of freedom = %4d\n', idf);
fprintf('\nVariable Parameter estimate Standard error\n\n');
ivar = perm(1:ip)';
fprintf('%6d%20.4e%20.4e\n',[ivar b se]');
end
g02df example results
Residual sum of squares = 8.4066e-02
Degrees of freedom = 7
Variable 2 dropped
Residual sum of squares = 2.1239e-01
Degrees of freedom = 8
Variable Parameter estimate Standard error
1 3.6372e+00 1.5083e-01
3 6.1264e-01 2.8007e-02
4 -6.0154e-01 4.2335e-01
5 1.6709e-01 7.8656e-02
Variable 4 dropped
Residual sum of squares = 3.3220e-01
Degrees of freedom = 9
Variable Parameter estimate Standard error
1 3.5974e+00 1.7647e-01
3 6.2088e-01 3.2706e-02
4 2.4247e-01 1.7235e-01
PDF version (NAG web site
, 64-bit version, 64-bit version)
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015