NAG CL Interface
g02dcc (linregm_obs_edit)

1 Purpose

g02dcc adds or deletes an observation from a general regression model fitted by g02dac.

2 Specification

#include <nag.h>
void  g02dcc (Nag_UpdateObserv update, Nag_IncludeMean mean, Integer m, const Integer sx[], double q[], Integer tdq, Integer ip, const double x[], Integer nr, Integer tdx, Integer ix, double y, const double wt[], double *rss, NagError *fail)
The function may be called by the names: g02dcc, nag_correg_linregm_obs_edit or nag_regsn_mult_linear_addrem_obs.

3 Description

g02dac fits a general linear regression model to a dataset. You may wish to change the model by either adding or deleting an observation from the dataset. g02dcc takes the results from g02dac and makes the required changes to the vector c and the upper triangular matrix R produced by g02dac. The regression coefficients, standard errors and the variance-covariance matrix of the regression coefficients can be obtained from g02ddc after all required changes to the dataset have been made.
g02dac performs a QR decomposition on the (weighted) X matrix of independent variables. To add a new observation to a model with p parameters, the upper triangular matrix $R$ and the vector $c_1$, the first p elements of $c$, are augmented by the new observation on the independent variables, $x^T$, and on the dependent variable, $y$. Givens rotations are then used to restore the upper triangular form:
$$\begin{pmatrix} R & c_1 \\ x^T & y \end{pmatrix} \rightarrow \begin{pmatrix} R^* & c_1^* \\ 0 & y^* \end{pmatrix}$$
To delete an observation, Givens rotations are applied to give:
$$\begin{pmatrix} R & c_1 \end{pmatrix} \rightarrow \begin{pmatrix} R^* & c_1^* \\ x^T & y \end{pmatrix}$$
Note: only $R$ and the upper part of $c$ are updated; the remainder of the Q matrix is unchanged.

4 References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25

5 Arguments

1: update Nag_UpdateObserv Input
On entry: indicates if an observation is to be added or deleted.
update=Nag_ObservAdd
The observation is added.
update=Nag_ObservDel
The observation is deleted.
Constraint: update=Nag_ObservAdd or Nag_ObservDel.
2: mean Nag_IncludeMean Input
On entry: indicates if a mean has been used in the model.
mean=Nag_MeanInclude
A mean term or intercept will have been included in the model by g02dac.
mean=Nag_MeanZero
A model with no mean term or intercept will have been fitted by g02dac.
Constraint: mean=Nag_MeanInclude or Nag_MeanZero.
3: m Integer Input
On entry: the total number of independent variables in the dataset.
Constraint: m ≥ 1.
4: sx[m] const Integer Input
On entry: if sx[j] is greater than 0, then the value contained in x[(ix-1)×tdx+j] is to be included as a value of $x^T$, an observation on an independent variable, for j=0,1,…,m-1.
Constraint: if mean=Nag_MeanInclude, then exactly ip-1 elements of sx must be > 0, and if mean=Nag_MeanZero, then exactly ip elements of sx must be > 0.
5: q[ip×tdq] double Input/Output
Note: the (i,j)th element of the matrix Q is stored in q[(i-1)×tdq+j-1].
On entry: q must be array q as output by g02dac, g02dec, g02dfc, or a previous call to g02dcc.
On exit: the first ip elements of the first column of q will contain $c_1^*$ and the upper triangular part of columns 2 to ip+1 will contain $R^*$; the remainder is unchanged.
6: tdq Integer Input
On entry: the stride separating matrix column elements in the array q.
Constraint: tdq ≥ ip+1.
7: ip Integer Input
On entry: the number of linear terms in the general linear regression model (including the mean if there is one).
Constraint: ip ≥ 1.
8: x[nr×tdx] const double Input
On entry: the values of the independent variables for the observation to be added or deleted, $x^T$. The positions of the values extracted from x depend on ix and tdx.
9: nr Integer Input
On entry: the number of rows of the notional two-dimensional array x.
Constraint: nr ≥ 1.
10: tdx Integer Input
On entry: the stride separating matrix column elements in the array x.
Constraint: tdx ≥ m.
11: ix Integer Input
On entry: the row of the notional two-dimensional array x that contains the values of the independent variables for the observation to be added or deleted.
Constraint: 1 ≤ ix ≤ nr.
12: y double Input
On entry: the value of the dependent variable for the observation to be added or deleted, y .
13: wt[1] const double Input
On entry: if the new observation is to be weighted, then wt must contain the weight to be used with the new observation. If wt[0]=0.0, then the observation is not included in the model. If the new observation is to be unweighted, then wt must be supplied as NULL.
Constraint: if the new observation is to be weighted, wt[0] ≥ 0.0.
14: rss double * Input/Output
On entry: the value of the residual sum of squares for the original set of observations.
Constraint: rss ≥ 0.0.
On exit: the updated value of the residual sum of squares.
Note: this will only be valid if the model is of full rank.
15: fail NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
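The indexing rules given for sx, x and q above can be sketched as follows. This is an illustration only: the helper names select_obs and q_elem are hypothetical and are not NAG Library functions, and placing the constant 1 for the mean term first in the row is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NAG Library function): copies the values of
 * row ix (1-based) of x that are selected by sx[j] > 0 into out, and
 * returns the number of values written. If include_mean is nonzero, a
 * constant 1 is written first (an assumption about where the mean term
 * sits in the row). */
static int select_obs(int include_mean, int m, const int sx[],
                      const double x[], int tdx, int ix, double out[])
{
    assert(m >= 1 && tdx >= m && ix >= 1);
    int n = 0;
    if (include_mean)
        out[n++] = 1.0;
    /* Row ix starts at x[(ix-1)*tdx]; variable j is at offset j. */
    const double *row = x + (size_t)(ix - 1) * (size_t)tdx;
    for (int j = 0; j < m; ++j)
        if (sx[j] > 0)
            out[n++] = row[j];
    return n;
}

/* Element (i, j), both 1-based, of the matrix stored in q with stride tdq. */
static double q_elem(const double q[], int tdq, int i, int j)
{
    assert(i >= 1 && j >= 1 && j <= tdq);
    return q[(size_t)(i - 1) * (size_t)tdq + (size_t)(j - 1)];
}
```

With this layout, the value returned by select_obs should equal ip when the constraint on sx is satisfied, and q_elem(q, tdq, i, 1) reads the ith element of the first column of q.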

6 Error Indicators and Warnings

NE_2_INT_ARG_GT
On entry, ix=value while nr=value. These arguments must satisfy ix ≤ nr.
NE_2_INT_ARG_LT
On entry, tdq=value while ip+1=value. These arguments must satisfy tdq ≥ ip+1.
On entry, tdx=value while m=value. These arguments must satisfy tdx ≥ m.
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, mean had an illegal value.
On entry, update had an illegal value.
NE_INT_ARG_LT
On entry, ip=value.
Constraint: ip ≥ 1.
On entry, ix=value.
Constraint: ix ≥ 1.
On entry, m=value.
Constraint: m ≥ 1.
On entry, nr=value.
Constraint: nr ≥ 1.
NE_IP_INCOMP_WITH_SX
On entry, for mean=Nag_MeanInclude, number of nonzero values of sx must be equal to ip-1: number of nonzero values of sx=value, ip-1=value.
On entry, for mean=Nag_MeanZero, number of nonzero values of sx must be equal to ip: number of nonzero values of sx=value, ip=value.
NE_MAT_NOT_UPD
The R matrix could not be updated: either an attempt was made to delete a nonexistent observation, or an attempt was made to add an observation to an R matrix with a zero diagonal element.
NE_REAL_ARG_LT
On entry, rss=value.
Constraint: rss ≥ 0.0.
On entry, wt[0]=value.
Constraint: wt[0] ≥ 0.0.
NE_RSS_NOT_UPD
The rss could not be updated because the input rss was less than the calculated decrease in rss when the new observation was deleted.

7 Accuracy

Updating the R matrix achieves higher accuracy than the traditional method of updating X'X.
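One standard way to see why working with R is preferable, stated here as background rather than as a claim about the internals of g02dcc: for X of full column rank with X = QR and Q having orthonormal columns, the 2-norm condition numbers satisfy

```latex
\kappa_2(R) = \kappa_2(X), \qquad \kappa_2(X^{T}X) = \kappa_2(X)^{2}.
```

Forming X'X therefore squares the condition number of the problem, whereas operating on R leaves it unchanged.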

8 Parallelism and Performance

g02dcc is not threaded in any implementation.

9 Further Comments

Care should be taken with the use of this function.
  (a) It is possible to delete observations which were not included in the original model.
  (b) If several additions/deletions have been performed, you are advised to recompute the regression using g02dac.
  (c) Adding or deleting observations can alter the rank of the model. Such changes will only be detected when a call to g02ddc has been made. g02ddc should also be used to compute the new residual sum of squares when the model is not of full rank.
g02dcc may also be used after g02dec and g02dfc.

10 Example

A dataset consisting of 12 observations with four independent variables is read in and a general linear regression model fitted by g02dac and parameter estimates printed. The last observation is then dropped and the parameter estimates recalculated, using g02ddc, and printed.

10.1 Program Text

Program Text (g02dcce.c)

10.2 Program Data

Program Data (g02dcce.d)

10.3 Program Results

Program Results (g02dcce.r)