NAG CL Interface
g02dfc (linregm_var_del)

1 Purpose

g02dfc deletes an independent variable from a general linear regression model.

2 Specification

 #include <nag.h>
 void g02dfc (Integer ip, double q[], Integer tdq, Integer indx, double *rss, NagError *fail)
The function may be called by the names: g02dfc, nag_correg_linregm_var_del or nag_regsn_mult_linear_delete_var.

3 Description

When selecting a linear regression model it is sometimes useful to drop independent variables from the model and to examine the resulting sub-model. g02dfc updates the $QR$ decomposition used in the computation of the linear regression model. The $QR$ decomposition may come from g02dac, g02dcc, g02dec or a previous call to g02dfc.
For the general linear regression model with $p$ independent variables fitted, g02dac or g02dec computes a $QR$ decomposition of the (weighted) independent variables and forms an upper triangular matrix $R$ and a vector $c$. To remove an independent variable, $R$ and $c$ have to be updated. The column of $R$ corresponding to the variable to be dropped is removed, and the matrix is then restored to upper triangular form by applying a series of Givens rotations. The same rotations are then applied to $c$. Note that only the first $p$ elements of $c$ are affected.
Because this method updates $R$ and $c$ without producing an updated $Q$ from the $QR$ decomposition, a call to g02dec cannot be made after a call to g02dfc.
g02ddc can be used to calculate the parameter estimates, $\hat{\beta}$, from the information provided by g02dfc.

4 References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25

5 Arguments

1: $\mathbf{ip}$ – Integer Input
On entry: the number of independent variables already in the model, $p$.
Constraint: ${\mathbf{ip}}\ge 1$.
2: $\mathbf{q}\left[{\mathbf{ip}}\times {\mathbf{tdq}}\right]$ – double Input/Output
Note: the $\left(i,j\right)$th element of the matrix $Q$ is stored in ${\mathbf{q}}\left[\left(i-1\right)\times {\mathbf{tdq}}+j-1\right]$.
On entry: the results of the $QR$ decomposition as returned by g02dac, g02dcc, g02dec or previous calls to g02dfc.
On exit: the updated $QR$ decomposition. The first ip elements of the first column of q contain the updated value of $c$; the upper triangular part of columns 2 to ip contains the updated $R$ matrix.
3: $\mathbf{tdq}$ – Integer Input
On entry: the stride separating matrix column elements in the array q.
Constraint: ${\mathbf{tdq}}\ge {\mathbf{ip}}+1$.
4: $\mathbf{indx}$ – Integer Input
On entry: indicates which independent variable is to be deleted from the model.
Constraint: $1\le {\mathbf{indx}}\le {\mathbf{ip}}$.
5: $\mathbf{rss}$ – double * Input/Output
On entry: the residual sum of squares for the full regression.
Constraint: ${\mathbf{rss}}\ge 0.0$.
On exit: the residual sum of squares with the (indx)th variable removed. Note that the residual sum of squares will only be valid if the regression is of full rank.
6: $\mathbf{fail}$ – NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
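The storage rule given for q can be made concrete with a few accessor functions. This is only a sketch: `q_index`, `c_elem` and `r_elem` are hypothetical names, not NAG functions, and they simply restate the indexing formula ${\mathbf{q}}\left[\left(i-1\right)\times {\mathbf{tdq}}+j-1\right]$ with $c$ in column 1 and $R$ starting in column 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accessors for illustration; not part of the NAG Library. */

/* Offset of the 1-based element (i, j) in an array with row stride tdq. */
static size_t q_index(int i, int j, int tdq)
{
    assert(i >= 1 && j >= 1 && j <= tdq);
    return (size_t)(i - 1) * (size_t)tdq + (size_t)(j - 1);
}

/* i-th element of c, stored in the first column of q. */
static double c_elem(const double q[], int tdq, int i)
{
    return q[q_index(i, 1, tdq)];
}

/* Element (i, j) of the upper triangular R, stored from column 2 onwards. */
static double r_elem(const double q[], int tdq, int i, int j)
{
    assert(i <= j);                 /* only the upper triangle holds R */
    return q[q_index(i, j + 1, tdq)];
}
```

For example, with tdq = 3 the element $R_{12}$ sits at offset 2 and $c_2$ at offset 3 of the array.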

6 Error Indicators and Warnings

NE_2_INT_ARG_GT
On entry, ${\mathbf{indx}}=⟨\mathit{\text{value}}⟩$ while ${\mathbf{ip}}=⟨\mathit{\text{value}}⟩$. These arguments must satisfy ${\mathbf{indx}}\le {\mathbf{ip}}$.
NE_2_INT_ARG_LT
On entry, ${\mathbf{tdq}}=⟨\mathit{\text{value}}⟩$ while ${\mathbf{ip}}+1=⟨\mathit{\text{value}}⟩$. These arguments must satisfy ${\mathbf{tdq}}\ge {\mathbf{ip}}+1$.
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_DIAG_ELEM_ZERO
On entry, a diagonal element, $⟨\mathit{\text{value}}⟩$, of $R$ is zero.
NE_INT_ARG_LT
On entry, ${\mathbf{indx}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{indx}}\ge 1$.
On entry, ${\mathbf{ip}}=⟨\mathit{\text{value}}⟩$.
Constraint: ${\mathbf{ip}}\ge 1$.
NE_REAL_ARG_LT
On entry, rss must not be less than 0.0: ${\mathbf{rss}}=⟨\mathit{\text{value}}⟩$.
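The argument constraints behind these error exits can be checked before the call. The following is a minimal sketch under stated assumptions: the enum `ArgCheck` and the function `check_args` are hypothetical and are not NAG identifiers, and the order in which the conditions are tested is an assumption, not the order used by g02dfc. It does not cover NE_ALLOC_FAIL or NE_DIAG_ELEM_ZERO, which depend on runtime state and on the contents of q.

```c
/* Hypothetical pre-call check for illustration; not part of the NAG Library. */
typedef enum {
    ARGS_OK,
    ARGS_IP_LT_1,            /* violates ip >= 1          */
    ARGS_INDX_LT_1,          /* violates indx >= 1        */
    ARGS_INDX_GT_IP,         /* violates indx <= ip       */
    ARGS_TDQ_LT_IP_PLUS_1,   /* violates tdq >= ip + 1    */
    ARGS_RSS_NEGATIVE        /* violates rss >= 0.0       */
} ArgCheck;

static ArgCheck check_args(int ip, int tdq, int indx, double rss)
{
    if (ip < 1)        return ARGS_IP_LT_1;
    if (indx < 1)      return ARGS_INDX_LT_1;
    if (indx > ip)     return ARGS_INDX_GT_IP;
    if (tdq < ip + 1)  return ARGS_TDQ_LT_IP_PLUS_1;
    if (!(rss >= 0.0)) return ARGS_RSS_NEGATIVE;   /* also rejects NaN */
    return ARGS_OK;
}
```

A caller could run such a check first and only then pass the arguments to g02dfc, still inspecting fail afterwards for the remaining error exits.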

7 Accuracy

There will inevitably be some loss in accuracy in fitting a model by dropping terms from a more complex model rather than fitting it afresh using g02dac.

8 Parallelism and Performance

g02dfc is not threaded in any implementation.

9 Further Comments

None.

10 Example

A dataset consisting of 12 observations on four independent variables and one dependent variable is read in. The full model, including a mean term, is fitted using g02dac. The value of indx is read in and that variable is dropped from the regression. The parameter estimates are calculated by g02ddc and printed. This process is repeated until indx is 0.

10.1 Program Text

Program Text (g02dfce.c)

10.2 Program Data

Program Data (g02dfce.d)

10.3 Program Results

Program Results (g02dfce.r)