NAG Library Function Document

nag_regsn_mult_linear_newyvar (g02dgc)

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

10 Example

10.1 Program Text

10.2 Program Data

10.3 Program Results

1 Purpose

nag_regsn_mult_linear_newyvar (g02dgc) calculates the estimates of the parameters of a general linear regression model for a new dependent variable after a call to nag_regsn_mult_linear (g02dac).

2 Specification

#include <nag.h>

#include <nagg02.h>

void nag_regsn_mult_linear_newyvar (Integer n, const double wt[], double *rss, Integer ip, Integer rank, double cov[], double q[], Integer tdq, Nag_Boolean svd, const double p[], const double y[], double b[], double se[], double res[], const double com_ar[], NagError *fail)

3 Description

nag_regsn_mult_linear_newyvar (g02dgc) uses the results given by nag_regsn_mult_linear (g02dac) to fit the same set of independent variables to a new dependent variable.

nag_regsn_mult_linear (g02dac) computes a $QR$ decomposition of the matrix of $p$ independent variables and also, if the model is not of full rank, a singular value decomposition (SVD). These results can be used to compute estimates of the parameters for a general linear model with a new dependent variable. The $QR$ decomposition leads to the formation of an upper triangular $p$ by $p$ matrix $R$ and an $n$ by $n$ orthogonal matrix $Q$. In addition the vector $c = Q^{T} y$ (or $Q^{T} W^{1/2} y$) is computed. For a new dependent variable, $y_{new}$, nag_regsn_mult_linear_newyvar (g02dgc) computes a new value of $c = Q^{T} y_{new}$ or $Q^{T} W^{1/2} y_{new}$.

If $R$ is of full rank, then the least squares parameter estimates, $\hat{\beta}$, are the solution to

$$R \hat{\beta} = c_{1},$$

where $c_{1}$ is the first $p$ elements of $c$.

If $R$ is not of full rank, then nag_regsn_mult_linear (g02dac) will have computed the SVD of $R$,

$$R = Q_{*} \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} P^{T},$$

where $D$ is a $k$ by $k$ diagonal matrix with nonzero diagonal elements, $k$ being the rank of $R$, and $Q_{*}$ and $P$ are $p$ by $p$ orthogonal matrices. This gives the solution

$$\hat{\beta} = P_{1} D^{-1} Q_{*1}^{T} c_{1},$$

$P_{1}$ being the first $k$ columns of $P$, i.e., $P = (P_{1} \; P_{0})$, and $Q_{*1}$ being the first $k$ columns of $Q_{*}$. Details of the SVD are made available by nag_regsn_mult_linear (g02dac) in the form of the matrix $P^{*}$:

$$P^{*} = \begin{pmatrix} D^{-1} P_{1}^{T} \\ P_{0}^{T} \end{pmatrix}.$$

The matrix $Q_{*}$ is made available through the com_ar argument of nag_regsn_mult_linear (g02dac).
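The two solution paths described in this section (back substitution when R is of full rank, and the SVD-based formula otherwise) can be illustrated with a minimal sketch in plain C. It does not call the NAG Library. The function names back_substitute and svd_solve are hypothetical, and the row-major layouts assumed for R and Q_* are assumptions made for this example only; they say nothing about how nag_regsn_mult_linear_newyvar (g02dgc) stores or accesses these matrices internally.

```c
#include <assert.h>
#include <stddef.h>

/* Full-rank case: solve R * beta = c1 by back substitution.
 * r    : upper triangular p-by-p matrix, row-major, leading dimension ldr
 *        (layout assumed for this sketch)
 * c1   : first p elements of c
 * beta : output, p parameter estimates
 * Returns 0 on success, -1 if a zero diagonal element is met. */
int back_substitute(size_t p, const double *r, size_t ldr,
                    const double *c1, double *beta)
{
    assert(ldr >= p);
    for (size_t i = p; i-- > 0;) {
        double sum = c1[i];
        for (size_t j = i + 1; j < p; ++j)
            sum -= r[i * ldr + j] * beta[j];
        if (r[i * ldr + i] == 0.0)
            return -1;               /* R is not of full rank */
        beta[i] = sum / r[i * ldr + i];
    }
    return 0;
}

/* Rank-deficient case: beta = P1 * D^{-1} * Qs1^T * c1.
 * qstar : p-by-p orthogonal matrix Q_*, row-major (layout assumed)
 * pstar : p-by-p matrix P^* stored by rows; its first k rows hold D^{-1} P1^T
 * k     : rank of R */
void svd_solve(size_t p, size_t k, const double *qstar, const double *pstar,
               const double *c1, double *beta)
{
    assert(k <= p);
    for (size_t j = 0; j < p; ++j)
        beta[j] = 0.0;
    for (size_t l = 0; l < k; ++l) {
        /* z = (column l of Q_*)^T c1 */
        double z = 0.0;
        for (size_t i = 0; i < p; ++i)
            z += qstar[i * p + l] * c1[i];
        /* beta += z * (row l of P^*)^T, because P1 D^{-1} = (D^{-1} P1^T)^T */
        for (size_t j = 0; j < p; ++j)
            beta[j] += pstar[l * p + j] * z;
    }
}
```

The second routine uses only the first k rows of P^*, which is why the stored form D^{-1} P_1^T is convenient: no division by singular values is needed at solve time.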

In addition to parameter estimates, the new residuals are computed and the variance-covariance matrix of the parameter estimates is found by scaling the variance-covariance matrix for the original regression.
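One way to read the scaling step: the independent variables, the weights and the residual degrees of freedom are the same for both fits, so the two variance-covariance matrices differ only through the residual mean square. Under that assumption the scale factor reduces to the ratio of the new to the old residual sum of squares. The sketch below applies such a factor to a packed upper triangle; the function name rescale_cov is hypothetical, and the choice of factor is an inference from the description above rather than a documented detail of the routine.

```c
#include <assert.h>
#include <stddef.h>

/* Scale the ip*(ip+1)/2 packed covariance elements by rss_new / rss_old.
 * Assumes the residual degrees of freedom are unchanged between the fits. */
void rescale_cov(size_t ip, double *cov, double rss_old, double rss_new)
{
    assert(rss_old > 0.0);           /* rss on entry must be positive */
    const double factor = rss_new / rss_old;
    const size_t len = ip * (ip + 1) / 2;
    for (size_t t = 0; t < len; ++t)
        cov[t] *= factor;
}
```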

4 References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore

Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25

Searle S R (1971) Linear Models Wiley

5 Arguments

1: n – Integer Input

On entry: the number of observations, $n$.

Constraint: $n \geq 2$.

2: wt[n] – const double Input

On entry: optionally, the weights to be used in the weighted regression.

If $wt[i - 1] = 0.0$, then the $i$th observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights. The values of res and h will be set to zero for observations with zero weights (see nag_regsn_mult_linear (g02dac)).

If weights are not provided then wt must be set to NULL and the effective number of observations is n.

Constraint: if wt is not NULL, $wt[i - 1] \geq 0.0$, for $i = 1, 2, \dots, n$.
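The rule for the effective number of observations can be sketched as a small helper. This is illustrative only: the name effective_observations is hypothetical, plain long is used in place of the library's Integer type, and the return value of -1 for a negative weight is a convention of this sketch, not the library's error mechanism.

```c
#include <assert.h>
#include <stddef.h>

/* Effective number of observations: n when wt is NULL, otherwise the
 * number of nonzero weights. Returns -1 if a weight is negative, which
 * would violate the constraint wt[i-1] >= 0.0. */
long effective_observations(long n, const double *wt)
{
    assert(n >= 0);
    if (wt == NULL)
        return n;
    long count = 0;
    for (long i = 0; i < n; ++i) {
        if (wt[i] < 0.0)
            return -1;
        if (wt[i] > 0.0)
            ++count;
    }
    return count;
}
```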

3: rss – double * Input/Output

On entry: the residual sum of squares for the original dependent variable.

On exit: the residual sum of squares for the new dependent variable.

4: ip – Integer Input

On entry: the number $p$ of independent variables in the model (including the mean if fitted).

Constraint: $1 \leq ip \leq n$.

5: rank – Integer Input

On entry: the rank of the independent variables, as given by nag_regsn_mult_linear (g02dac).

Constraint: $rank > 0$ and, if svd = Nag_FALSE, $rank = ip$, otherwise $rank \leq ip$.

6: cov[$ip \times (ip + 1) / 2$] – double Input/Output

On entry: the covariance matrix of the parameter estimates as given by nag_regsn_mult_linear (g02dac).

On exit: the upper triangular part of the variance-covariance matrix of the ip parameter estimates given in b. They are stored packed by column, i.e., the covariance between the parameter estimate given in $b[i]$ and the parameter estimate given in $b[j]$, $j \geq i$, is stored in $cov[j (j + 1) / 2 + i]$, for $i = 0, 1, \dots, ip - 1$ and $j = i, i + 1, \dots, ip - 1$.
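The packed-by-column storage rule for cov can be written as a pair of small helpers. The names packed_index and cov_at are hypothetical and are not part of the NAG Library; the index formula itself is the one given for cov above.

```c
#include <assert.h>
#include <stddef.h>

/* Position in cov of the covariance between b[i] and b[j] (0-based),
 * for the upper triangle, i.e. i <= j. */
size_t packed_index(size_t i, size_t j)
{
    assert(i <= j);
    return j * (j + 1) / 2 + i;
}

/* Covariance between b[i] and b[j] for any ordering of i and j,
 * using the symmetry of the matrix. */
double cov_at(const double *cov, size_t i, size_t j)
{
    return (i <= j) ? cov[packed_index(i, j)] : cov[packed_index(j, i)];
}
```

With ip = 3 the six stored elements appear in the order (0,0), (0,1), (1,1), (0,2), (1,2), (2,2), so the variance of b[i] is found at position i (i + 1) / 2 + i.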

7: q[$n \times tdq$] – double Input/Output

Note: the $(i, j)$th element of the matrix $Q$ is stored in $q[(i - 1) \times tdq + j - 1]$.

On entry: the results of the $QR$ decomposition as returned by nag_regsn_mult_linear (g02dac).

On exit: the first column of q contains the new values of $c$; the remainder of q will be unchanged.
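The storage rule for q means that consecutive elements of a column are tdq entries apart in memory. A minimal sketch of reading the first column (the new values of c on exit) follows; the helper names q_offset and copy_c are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset in q of the (i, j)th element of Q, with i and j counted from 1. */
size_t q_offset(size_t i, size_t j, size_t tdq)
{
    assert(i >= 1 && j >= 1 && j <= tdq);
    return (i - 1) * tdq + (j - 1);
}

/* Copy the first column of q into c[0..n-1]. */
void copy_c(size_t n, const double *q, size_t tdq, double *c)
{
    for (size_t i = 1; i <= n; ++i)
        c[i - 1] = q[q_offset(i, 1, tdq)];
}
```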

8: tdq – Integer Input

On entry: the stride separating matrix column elements in the array q.

Constraint: $tdq \geq ip + 1$.

9: svd – Nag_Boolean Input

On entry: indicates if a singular value decomposition was used by nag_regsn_mult_linear (g02dac).

$svd = Nag_TRUE$: A singular value decomposition was used by nag_regsn_mult_linear (g02dac).
$svd = Nag_FALSE$: A singular value decomposition was not used by nag_regsn_mult_linear (g02dac).

10: p[$2 \times ip + ip \times ip$] – const double Input

On entry: details of the $QR$ decomposition and SVD, if used, as returned in array p by nag_regsn_mult_linear (g02dac).

If svd = Nag_FALSE, only the first ip elements of p are used; these will contain details of the Householder vector in the $QR$ decomposition (Sections 2.2.1 and 3.3.6 in the f08 Chapter Introduction).

If svd = Nag_TRUE, the first ip elements of p will contain details of the Householder vector in the $QR$ decomposition (Sections 2.2.1 and 3.3.6 in the f08 Chapter Introduction) and the next ip elements of p contain singular values. The following ip by ip elements contain the matrix $P^{*}$ stored by rows.
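The three regions of p described above can be addressed with fixed offsets. The sketch below is illustrative: the PView structure and the functions view_p and p_star_at are hypothetical names, and an int flag stands in for the library's Nag_Boolean so that the example does not need the NAG headers.

```c
#include <assert.h>
#include <stddef.h>

/* Read-only view of the regions of the array p. */
typedef struct {
    const double *householder; /* p[0 .. ip-1] */
    const double *singular;    /* p[ip .. 2*ip-1], NULL if no SVD was used */
    const double *p_star;      /* p[2*ip ..], ip by ip, stored by rows;
                                  NULL if no SVD was used */
} PView;

PView view_p(const double *p, size_t ip, int svd_used)
{
    PView v;
    v.householder = p;
    v.singular = svd_used ? p + ip : NULL;
    v.p_star = svd_used ? p + 2 * ip : NULL;
    return v;
}

/* Element (row, col) of P^*, both counted from 0. */
double p_star_at(const PView *v, size_t ip, size_t row, size_t col)
{
    assert(v->p_star != NULL && row < ip && col < ip);
    return v->p_star[row * ip + col];
}
```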

11: y[n] – const double Input

On entry: the new dependent variable, $y_{new}$.

12: b[ip] – double Output

On exit: $b[i]$, for $i = 0, 1, \dots, ip - 1$, contain the least squares estimates of the parameters of the regression model, $\hat{\beta}$.

13: se[ip] – double Output

On exit: $se[i]$, for $i = 0, 1, \dots, ip - 1$, contain the standard errors of the ip parameter estimates given in b.

14: res[n] – double Output

On exit: the residuals for the new regression model.

15: com_ar[$5 \times (ip - 1) \times ip$] – const double Input

On entry: if svd = Nag_TRUE, com_ar must be unaltered from the previous call to nag_regsn_mult_linear (g02dac).

16: fail – NagError * Input/Output

The NAG error argument (see Section 3.6 in the Essential Introduction).

6 Error Indicators and Warnings

NE_2_INT_ARG_LT: On entry, $n = ⟨value⟩$ while $ip = ⟨value⟩$. These arguments must satisfy $n \geq ip$.
On entry, $tdq = ⟨value⟩$ while $ip + 1 = ⟨value⟩$. These arguments must satisfy $tdq \geq ip + 1$.
NE_INT_ARG_LE: On entry, $rank = ⟨value⟩$.
Constraint: $rank > 0$.
NE_INT_ARG_LT: On entry, $ip = ⟨value⟩$.
Constraint: $ip \geq 1$.
NE_REAL_ARG_LE: On entry, rss must not be less than or equal to 0.0: $rss = ⟨value⟩$.
NE_REAL_ARG_LT: On entry, $wt [⟨value⟩]$ must not be less than 0.0: $wt [⟨value⟩] = ⟨value⟩$.
NE_SVD_RANK_GT_IP: On entry, the Boolean variable, svd, is Nag_TRUE and rank must not be greater than ip: $rank = ⟨value⟩$, $ip = ⟨value⟩$.
NE_SVD_RANK_NE_IP: On entry, the Boolean variable, svd, is Nag_FALSE and rank must be equal to ip: $rank = ⟨value⟩$, $ip = ⟨value⟩$.

7 Accuracy

The same accuracy as nag_regsn_mult_linear (g02dac) is obtained.

8 Parallelism and Performance

Not applicable.

9 Further Comments

The values of the leverages, $h_{i}$, are unaltered by a change in the dependent variable, so a call to nag_regsn_std_resid_influence (g02fac) can be made using the value of h from nag_regsn_mult_linear (g02dac).

10 Example

A dataset consisting of 12 observations with four independent variables and two dependent variables is read in. A model with all four independent variables is fitted to the first dependent variable by nag_regsn_mult_linear (g02dac) and the results printed. The model is then fitted to the second dependent variable by nag_regsn_mult_linear_newyvar (g02dgc) and those results printed.
