void g02dgc (Integer n, const double wt[], double *rss, Integer ip, Integer rank, double cov[], double q[], Integer tdq, Nag_Boolean svd, const double p[], const double y[], double b[], double se[], double res[], const double com_ar[], NagError *fail)

The function may be called by the names: g02dgc, nag_correg_linregm_fit_newvar or nag_regsn_mult_linear_newyvar.

3 Description

g02dgc uses the results given by g02dac to fit the same set of independent variables to a new dependent variable.

g02dac computes a $QR$ decomposition of the matrix of $p$ independent variables and also, if the model is not of full rank, a singular value decomposition (SVD). These results can be used to compute estimates of the parameters for a general linear model with a new dependent variable. The $QR$ decomposition leads to the formation of an upper triangular $p \times p$ matrix $R$ and an $n \times n$ orthogonal matrix $Q$. In addition the vector $c = Q^{T} y$ (or $Q^{T} W^{1/2} y$) is computed. For a new dependent variable, $y_{new}$, g02dgc computes a new value of $c = Q^{T} y_{new}$ or $Q^{T} W^{1/2} y_{new}$.

If $R$ is of full rank, then the least squares parameter estimates, $\hat{\beta}$, are the solution to $R \hat{\beta} = c_{1}$, where $c_{1}$ is the first $p$ elements of $c$.
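The full-rank case is a triangular solve. The following is a minimal sketch of back-substitution, assuming $R$ is held as a dense row-major $p \times p$ array. The function name `back_substitute` and that storage layout are illustrative assumptions; they do not describe how g02dac or g02dgc store $R$ internally.

```c
#include <stddef.h>

/* Solve R * beta = c1 for an upper triangular p-by-p matrix R.
 * Assumption: r is dense, row-major, so R(i,j) is r[i*p + j].
 * Returns 0 on success, -1 if a zero diagonal element is met
 * (R is then not of full rank and this approach does not apply). */
int back_substitute(size_t p, const double *r, const double *c1, double *beta)
{
    /* Work upwards from the last row. */
    for (size_t k = p; k-- > 0;) {
        double diag = r[k * p + k];
        if (diag == 0.0)
            return -1;
        double sum = c1[k];
        for (size_t j = k + 1; j < p; ++j)
            sum -= r[k * p + j] * beta[j];
        beta[k] = sum / diag;
    }
    return 0;
}
```

A zero on the diagonal is reported rather than divided by, since that is the situation in which the SVD route is needed instead.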

If $R$ is not of full rank, then g02dac will have computed the SVD of $R$,

$$R = Q_{*} \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} P^{T},$$

where $D$ is a $k \times k$ diagonal matrix with nonzero diagonal elements, $k$ being the rank of $R$, and $Q_{*}$ and $P$ are $p \times p$ orthogonal matrices. This gives the solution

$$\hat{\beta} = P_{1} D^{-1} Q_{*_{1}}^{T} c_{1},$$

$P_{1}$ being the first $k$ columns of $P$, i.e., $P = (P_{1} \; P_{0})$, and $Q_{*_{1}}$ being the first $k$ columns of $Q_{*}$. Details of the SVD are made available by g02dac in the form of the matrix $P^{*}$:

$$P^{*} = \begin{pmatrix} D^{-1} P_{1}^{T} \\ P_{0}^{T} \end{pmatrix}.$$

The matrix $Q_{*}$ is made available through the com_ar argument of g02dac.
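Because $D$ is diagonal, $(D^{-1} P_{1}^{T})^{T} = P_{1} D^{-1}$, so the first $k$ rows of $P^{*}$ are enough to form the rank-deficient solution. The sketch below illustrates that arithmetic only. It assumes both $P^{*}$ and $Q_{*}$ are available as dense row-major $p \times p$ arrays; the function name `svd_solution` and the layout of `qstar` are hypothetical and say nothing about how com_ar is organised.

```c
#include <stddef.h>

/* Compute beta = P1 * D^{-1} * Qs1^T * c1 for a rank-k problem.
 * Assumptions (illustrative only):
 *   pstar : p-by-p, row-major; its first k rows hold D^{-1} * P1^T
 *   qstar : p-by-p, row-major; its first k columns are Qs1
 * The first k rows of pstar are applied transposed, which is valid
 * because D is diagonal. */
void svd_solution(size_t p, size_t k, const double *pstar,
                  const double *qstar, const double *c1, double *beta)
{
    for (size_t j = 0; j < p; ++j)
        beta[j] = 0.0;
    for (size_t i = 0; i < k; ++i) {
        /* t is the i-th element of Qs1^T * c1 */
        double t = 0.0;
        for (size_t l = 0; l < p; ++l)
            t += qstar[l * p + i] * c1[l];
        /* accumulate t times row i of pstar */
        for (size_t j = 0; j < p; ++j)
            beta[j] += pstar[i * p + j] * t;
    }
}
```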

In addition to parameter estimates, the new residuals are computed and the variance-covariance matrix of the parameter estimates is found by scaling the variance-covariance matrix for the original regression.
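The scaling can be pictured as follows: the covariance matrix is the residual variance multiplied by a matrix that depends only on the independent variables, so with the same residual degrees of freedom the two covariance matrices differ by the ratio of the residual sums of squares. The sketch below assumes that factor is new RSS divided by original RSS; `rescale_cov` is a hypothetical helper, and g02dgc performs this update itself.

```c
#include <stddef.h>

/* Rescale a packed covariance matrix of ip parameter estimates.
 * Assumption: the factor is rss_new / rss_old, which holds when the
 * residual degrees of freedom are the same for both fits.
 * Returns 0 on success, -1 if rss_old is not positive. */
int rescale_cov(size_t ip, double *cov, double rss_old, double rss_new)
{
    if (rss_old <= 0.0)
        return -1;
    double factor = rss_new / rss_old;
    size_t len = ip * (ip + 1) / 2;   /* packed upper triangle */
    for (size_t i = 0; i < len; ++i)
        cov[i] *= factor;
    return 0;
}
```

Dividing by the original RSS is also a plausible reason why rss is required to be strictly positive on entry.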

4 References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore

Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25

Searle S R (1971) Linear Models Wiley

5 Arguments

1: $n$ – Integer Input

On entry: the number of observations, $n$.

Constraint: $n \geq 2$.

2: $wt [n]$ – const double Input

On entry: optionally, the weights to be used in the weighted regression.

If $wt[i - 1] = 0.0$, then the $i$th observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights. The values of res and h will be set to zero for observations with zero weights (see g02dac).

If weights are not provided then wt must be set to NULL and the effective number of observations is n.

Constraint: if wt is not NULL, $wt[i - 1] \geq 0.0$, for $i = 1, 2, \dots, n$.
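The effective number of observations can be computed from wt as sketched below. The helper name `effective_observations` is hypothetical; it treats a negative weight as invalid, in line with the NE_REAL_ARG_LT error indicator.

```c
#include <stddef.h>

/* Count the effective number of observations: n when wt is NULL,
 * otherwise the number of nonzero weights.
 * Returns -1 if any weight is negative. */
long effective_observations(size_t n, const double *wt)
{
    if (wt == NULL)
        return (long)n;
    long count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (wt[i] < 0.0)
            return -1;
        if (wt[i] != 0.0)
            ++count;
    }
    return count;
}
```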

3: $rss$ – double * Input/Output

On entry: the residual sum of squares for the original dependent variable.

On exit: the residual sum of squares for the new dependent variable.

4: $ip$ – Integer Input

On entry: the number $p$ of independent variables in the model (including the mean if fitted).

Constraint: $1 \leq ip \leq n$.

5: $rank$ – Integer Input

On entry: the rank of the independent variables, as given by g02dac.

Constraint: $rank > 0$ and if $svd = Nag_FALSE$, $rank = ip$, otherwise $rank \leq ip$.

6: $cov [ip \times (ip + 1) / 2]$ – double Input/Output

On entry: the covariance matrix of the parameter estimates as given by g02dac.

On exit: the upper triangular part of the variance-covariance matrix of the ip parameter estimates given in b. They are stored packed by column, i.e., the covariance between the parameter estimate given in $b[i]$ and the parameter estimate given in $b[j]$, $j \geq i$, is stored in $cov[j (j + 1) / 2 + i]$, for $i = 0, 1, \dots, ip - 1$ and $j = i, i + 1, \dots, ip - 1$.
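The packed indexing rule for cov can be wrapped in a small helper. The sketch below uses a hypothetical function name, `cov_index`, and accepts the two parameter indices in either order since the matrix is symmetric.

```c
#include <stddef.h>

/* Index into the packed upper triangle for the covariance between
 * b[i] and b[j] (0-based); arguments may be given in either order. */
size_t cov_index(size_t i, size_t j)
{
    if (i > j) {            /* ensure i <= j */
        size_t t = i;
        i = j;
        j = t;
    }
    return j * (j + 1) / 2 + i;
}
```

For example, the variance of the estimate in `b[2]` would be read as `cov[cov_index(2, 2)]`.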

7: $q [n \times tdq]$ – double Input/Output

Note: the $(i, j)$th element of the matrix $Q$ is stored in $q[(i - 1) \times tdq + j - 1]$.

On entry: the results of the $QR$ decomposition as returned by g02dac.

On exit: the first column of q contains the new values of $c$, the remainder of q will be unchanged.

8: $tdq$ – Integer Input

On entry: the stride separating matrix column elements in the array q.

Constraint: $tdq \geq ip + 1$.
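Given the storage rule for q and the stride tdq, the new vector $c$ can be read back from the first column after the call. A minimal sketch, with a hypothetical helper name `copy_first_column`:

```c
#include <stddef.h>

/* Copy the first column of q (the new vector c) into c_out.
 * q is n-by-tdq with rows stored contiguously, so the 0-based
 * element (i, 0) is q[i * tdq]. */
void copy_first_column(size_t n, size_t tdq, const double *q, double *c_out)
{
    for (size_t i = 0; i < n; ++i)
        c_out[i] = q[i * tdq];
}
```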

9: $svd$ – Nag_Boolean Input

On entry: indicates if a singular value decomposition was used by g02dac.

$svd = Nag_TRUE$: A singular value decomposition was used by g02dac.
$svd = Nag_FALSE$: A singular value decomposition was not used by g02dac.

10: $p [2 \times ip + ip \times ip]$ – const double Input

On entry: details of the $QR$ decomposition and SVD, if used, as returned in array p by g02dac.

If $svd = Nag_FALSE$, only the first ip elements of p are used; these will contain details of the Householder vector in the $QR$ decomposition (Sections 2.2.1 and 3.4.6 in the F08 Chapter Introduction).

If $svd = Nag_TRUE$, the first ip elements of p will contain details of the Householder vector in the $QR$ decomposition (Sections 2.2.1 and 3.4.6 in the F08 Chapter Introduction) and the next ip elements of p contain singular values. The following ip by ip elements contain the matrix $P^{*}$ stored by rows.
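The three regions of p when svd is Nag_TRUE can be addressed with simple offsets. The accessor names in this sketch are hypothetical and are not part of the NAG Library; they only restate the layout given for this argument.

```c
#include <stddef.h>

/* First ip elements: details of the Householder vector. */
const double *p_householder(const double *p)
{
    return p;
}

/* Next ip elements: singular values. */
const double *p_singular_values(const double *p, size_t ip)
{
    return p + ip;
}

/* Element (i, j) of P*, 0-based, stored by rows after the
 * first 2*ip elements. */
double p_star(const double *p, size_t ip, size_t i, size_t j)
{
    return p[2 * ip + i * ip + j];
}
```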

11: $y [n]$ – const double Input

On entry: the new dependent variable $y_{new}$.

12: $b [ip]$ – double Output

On exit: $b[i]$, for $i = 0, 1, \dots, ip - 1$, contain the least squares estimates of the parameters of the regression model, $\hat{\beta}$.

13: $se [ip]$ – double Output

On exit: $se[i]$, for $i = 0, 1, \dots, ip - 1$, contain the standard errors of the ip parameter estimates given in b.

14: $res [n]$ – double Output

On exit: the residuals for the new regression model.

15: $com_ar [ip + ip \times ip]$ – const double Input

On entry: if $svd = Nag_TRUE$, com_ar must be unaltered from the previous call to g02dac.

16: $fail$ – NagError * Input/Output

The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).

6 Error Indicators and Warnings

NE_2_INT_ARG_LT: On entry, $n = ⟨value⟩$ while $ip = ⟨value⟩$. These arguments must satisfy $n \geq ip$.

On entry, $tdq = ⟨value⟩$ while $ip + 1 = ⟨value⟩$. These arguments must satisfy $tdq \geq ip + 1$.
NE_INT_ARG_LE: On entry, $rank = ⟨value⟩$.
Constraint: $rank > 0$.
NE_INT_ARG_LT: On entry, $ip = ⟨value⟩$.
Constraint: $ip \geq 1$.
NE_REAL_ARG_LE: On entry, rss must not be less than or equal to 0.0: $rss = ⟨value⟩$.
NE_REAL_ARG_LT: On entry, $wt[⟨value⟩]$ must not be less than 0.0: $wt[⟨value⟩] = ⟨value⟩$.
NE_SVD_RANK_GT_IP: On entry, the Boolean variable, svd, is Nag_TRUE and rank must not be greater than ip: $rank = ⟨value⟩$, $ip = ⟨value⟩$.
NE_SVD_RANK_NE_IP: On entry, the Boolean variable, svd, is Nag_FALSE and rank must be equal to ip: $rank = ⟨value⟩$, $ip = ⟨value⟩$.
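The conditions listed above can be checked before the call, for instance in a test harness. The sketch below is a hypothetical pre-check, not part of the NAG Library: the enum, the function name `check_args`, the use of `int` for svd, and the order in which conditions are tested are all assumptions, and g02dgc may report them in a different order.

```c
#include <stddef.h>

/* Hypothetical result codes; comments give the matching error indicator. */
enum check_result {
    CHECK_OK = 0,
    CHECK_IP,          /* NE_INT_ARG_LT     */
    CHECK_N_OR_TDQ,    /* NE_2_INT_ARG_LT   */
    CHECK_RANK,        /* NE_INT_ARG_LE     */
    CHECK_RSS,         /* NE_REAL_ARG_LE    */
    CHECK_RANK_GT_IP,  /* NE_SVD_RANK_GT_IP */
    CHECK_RANK_NE_IP,  /* NE_SVD_RANK_NE_IP */
    CHECK_WT           /* NE_REAL_ARG_LT    */
};

/* svd is nonzero for Nag_TRUE and zero for Nag_FALSE. */
enum check_result check_args(long n, const double *wt, double rss, long ip,
                             long rank, long tdq, int svd)
{
    if (ip < 1)                 return CHECK_IP;
    if (n < ip || tdq < ip + 1) return CHECK_N_OR_TDQ;
    if (rank <= 0)              return CHECK_RANK;
    if (rss <= 0.0)             return CHECK_RSS;
    if (svd && rank > ip)       return CHECK_RANK_GT_IP;
    if (!svd && rank != ip)     return CHECK_RANK_NE_IP;
    if (wt != NULL) {
        for (long i = 0; i < n; ++i) {
            if (wt[i] < 0.0)
                return CHECK_WT;
        }
    }
    return CHECK_OK;
}
```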

7 Accuracy

The same accuracy as g02dac is obtained.

8 Parallelism and Performance

Background information to multithreading can be found in the Multithreading documentation.

g02dgc is not threaded in any implementation.

9 Further Comments

The values of the leverages, $h_{i}$, are unaltered by a change in the dependent variable so a call to g02fac can be made using the value of h from g02dac.

9.1 Internal Changes

Internal changes have been made to this function as follows:

At Mark 26.1: The documented minimum length of the array argument com_ar was too large. The documented minimum length was given as $ip \times ip \times 5 \times (ip - 1)$ but the actual minimum length is $ip \times ip + ip$, which is much less for non-trivial cases, $ip > 1$.
In addition, provided example programs that called g02dgc allocated lengths of $ip \times ip + 5 \times (ip - 1)$ for the array argument, which was also larger than necessary for non-trivial problems.

The g02dgc function document was updated to document the actual minimum length requirement for com_ar, and those example programs that call g02dgc have been updated to allocate the actual minimum length required for com_ar.
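The minimum array lengths stated in the Arguments section, including the corrected length for com_ar, can be collected in a few helpers when allocating workspace. The function names in this sketch are hypothetical.

```c
#include <stddef.h>

/* Minimum lengths of the array arguments, as documented for g02dgc. */
size_t cov_len(size_t ip)          { return ip * (ip + 1) / 2; }
size_t q_len(size_t n, size_t tdq) { return n * tdq; }
size_t p_len(size_t ip)            { return 2 * ip + ip * ip; }
size_t com_ar_len(size_t ip)       { return ip + ip * ip; }
```

For a model with ip = 4, for example, these give 10 elements for cov, 24 for p and 20 for com_ar.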

For details of all known issues which have been reported for the NAG Library please refer to the Known Issues.

10 Example

A dataset consisting of 12 observations with four independent variables and two dependent variables is read in. A model with all four independent variables is fitted to the first dependent variable by g02dac and the results printed. The model is then fitted to the second dependent variable by g02dgc and those results printed.
