NAG Library Function Document

nag_hier_mixed_init (g02jcc)

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

9.1 Construction of the fixed effects design matrix, X

9.2 Construction of random effects design matrix, Z

9.3 The rndm argument

10 Example

1 Purpose

nag_hier_mixed_init (g02jcc) preprocesses a dataset prior to fitting a linear mixed effects regression model via either nag_reml_hier_mixed_regsn (g02jdc) or nag_ml_hier_mixed_regsn (g02jec).

2 Specification

#include <nag.h>

#include <nagg02.h>

void nag_hier_mixed_init (Nag_OrderType order, Integer n, Integer ncol, const double dat[], Integer pddat, const Integer levels[], const double y[], const double wt[], const Integer fixed[], Integer lfixed, Integer nrndm, const Integer rndm[], Integer lrndm, Integer *nff, Integer *nlsv, Integer *nrf, double rcomm[], Integer lrcomm, Integer icomm[], Integer licomm, NagError *fail)

3 Description

nag_hier_mixed_init (g02jcc) must be called prior to fitting a linear mixed effects regression model with either nag_reml_hier_mixed_regsn (g02jdc) or nag_ml_hier_mixed_regsn (g02jec).

The model fitting functions nag_reml_hier_mixed_regsn (g02jdc) and nag_ml_hier_mixed_regsn (g02jec) fit a model of the following form:

y = X β + Z ν + ε

where	$y$ is a vector of $n$ observations on the dependent variable,
	$X$ is an $n$ by $p$ design matrix of fixed independent variables,
	$β$ is a vector of $p$ unknown fixed effects,
	$Z$ is an $n$ by $q$ design matrix of random independent variables,
	$ν$ is a vector of length $q$ of unknown random effects,
	$ε$ is a vector of length $n$ of unknown random errors,

and $ν$ and $ε$ are Normally distributed with expectation zero and variance/covariance matrix defined by

Var [\begin{matrix} ν \\ ε \end{matrix}] = [\begin{matrix} G & 0 \\ 0 & R \end{matrix}]

where $R = σ_{R}^{2} I$, $I$ is the $n \times n$ identity matrix and $G$ is a diagonal matrix.

Case weights can be incorporated into the model by replacing $X$ and $Z$ with $W_{c}^{1 / 2} X$ and $W_{c}^{1 / 2} Z$ respectively, where $W_{c}$ is a diagonal weight matrix.

4 References

None.

5 Arguments

1: order – Nag_OrderType    Input

On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by $order = Nag_RowMajor$. See Section 3.2.1.3 in the Essential Introduction for a more detailed explanation of the use of this argument.

Constraint: $order = Nag_RowMajor$ or $Nag_ColMajor$.

2: n – Integer    Input

On entry: $n$, the number of observations.

The effective number of observations, that is the number of observations with nonzero weight (see wt for more detail), must be greater than the number of fixed effects in the model (as returned in nff).

Constraint: $n \geq 1$.

3: ncol – Integer    Input

On entry: the number of columns in the data matrix, dat.

Constraint: $ncol \geq 0$.

4: dat[dim] – const double    Input

Note: the dimension, dim, of the array dat must be at least

$\max (1, pddat \times ncol)$ when $order = Nag_ColMajor$ ;
$\max (1, n \times pddat)$ when $order = Nag_RowMajor$ .

Where $DAT (i, j)$ appears in this document, it refers to the array element

$dat [(j - 1) \times pddat + i - 1]$ when $order = Nag_ColMajor$ ;
$dat [(i - 1) \times pddat + j - 1]$ when $order = Nag_RowMajor$ .

On entry: a matrix of data, with $DAT (i, j)$ holding the $i$th observation on the $j$th variable. The two design matrices $X$ and $Z$ are constructed from dat and the information given in fixed (for $X$) and rndm (for $Z$).

Constraint: if $levels [j - 1] \neq 1$, $1 \leq DAT (i, j) \leq levels [j - 1]$.
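The mapping from $DAT (i, j)$ to an element of dat can be sketched as a small helper. This is an illustration only: the `StorageOrder` enum and the name `dat_offset` are hypothetical stand-ins, not part of the NAG interface, and plain `int` is used in place of `Integer`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two Nag_OrderType values used in this document. */
typedef enum { ROW_MAJOR, COL_MAJOR } StorageOrder;

/* Offset within dat of DAT(i, j), with i and j counted from 1 as in the text. */
static size_t dat_offset(StorageOrder order, int i, int j, int pddat)
{
    assert(i >= 1 && j >= 1 && pddat >= 1);
    if (order == COL_MAJOR)
        return (size_t)(j - 1) * (size_t)pddat + (size_t)(i - 1);
    return (size_t)(i - 1) * (size_t)pddat + (size_t)(j - 1);
}
```

For a dataset with four observations on three variables, `dat_offset(COL_MAJOR, 2, 3, 4)` addresses the second observation on the third variable when pddat is 4, and `dat_offset(ROW_MAJOR, 2, 3, 3)` addresses the same value when the data are stored by rows with pddat equal to 3.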

5: pddat – Integer    Input

On entry: the stride separating row or column elements (depending on the value of order) in the array dat.

Constraints:

if $order = Nag_ColMajor$ , $pddat \geq n$ ;
if $order = Nag_RowMajor$ , $pddat \geq ncol$ .

6: levels[ncol] – const Integer    Input

On entry: $levels [i - 1]$ contains the number of levels associated with the $i$th variable held in dat.

If the $i$th variable is continuous or binary (i.e., only takes the values zero or one) then $levels [i - 1]$ must be set to $1$. Otherwise the $i$th variable is assumed to take an integer value between $1$ and $levels [i - 1]$ (i.e., the $i$th variable is discrete with $levels [i - 1]$ levels).

Constraint: $levels [i - 1] \geq 1$, for $i = 1, 2, \dots, ncol$.

7: y[n] – const double    Input

On entry: $y$, the vector of observations on the dependent variable.

8: wt[n] – const double    Input

On entry: optionally, the weights to be used in the weighted regression.

If $wt [i - 1] = 0.0$, the $i$th observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights.

If weights are not provided then wt must be set to the null pointer, i.e., (double *)0, and the effective number of observations is n.

Constraint: if wt is not NULL, $wt [i - 1] \geq 0.0$, for $i = 1, 2, \dots, n$.
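The effective number of observations described above can be computed before the call, for instance to check it against the expected number of fixed effects. The helper below is a minimal sketch; the name `effective_n` is hypothetical and plain `int` stands in for `Integer`.

```c
#include <assert.h>
#include <stddef.h>

/* Number of observations with nonzero weight; a null wt means all n count. */
static int effective_n(int n, const double *wt)
{
    int count = 0;

    assert(n >= 1);
    if (wt == NULL)
        return n;
    for (int i = 0; i < n; ++i) {
        assert(wt[i] >= 0.0);   /* weights must be non-negative */
        if (wt[i] != 0.0)
            ++count;
    }
    return count;
}
```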

9: fixed[lfixed] – const Integer    Input

On entry: defines the structure of the fixed effects design matrix, $X$, as follows:

$fixed [0]$: The number of variables, $N_{F}$ , to include as fixed effects (not including the intercept if present).
$fixed [1]$: The fixed intercept flag which must contain $1$ if a fixed intercept is to be included and $0$ otherwise.
$fixed [2 + i - 1]$: The column of DAT holding the $i$th fixed variable, for $i = 1, 2, \dots, fixed [0]$ .

See Section 9.1 for more details on the construction of $X$.

Constraints:

$fixed [0] \geq 0$ ;
$fixed [1] = 0$ or $1$ ;
$1 \leq fixed [2 + i - 1] \leq ncol$ , for $i = 1, 2, \dots, fixed [0]$ .

10: lfixed – Integer    Input

On entry: length of the vector fixed.

Constraint: $lfixed \geq 2 + fixed [0]$.
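The constraints on fixed and lfixed can be checked up front with a few lines of C. The sketch below is illustrative only; `fixed_is_valid` is a hypothetical helper, not a NAG function, and plain `int` stands in for `Integer`.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if fixed satisfies the documented constraints, 0 otherwise. */
static int fixed_is_valid(const int fixed[], int lfixed, int ncol)
{
    assert(fixed != NULL);
    if (lfixed < 2)
        return 0;                         /* need the count and the flag */
    if (fixed[0] < 0)
        return 0;                         /* fixed[0] >= 0 */
    if (fixed[1] != 0 && fixed[1] != 1)
        return 0;                         /* fixed[1] = 0 or 1 */
    if (lfixed < 2 + fixed[0])
        return 0;                         /* lfixed >= 2 + fixed[0] */
    for (int i = 1; i <= fixed[0]; ++i) {
        int col = fixed[2 + i - 1];
        if (col < 1 || col > ncol)
            return 0;                     /* 1 <= column <= ncol */
    }
    return 1;
}
```

For example, an intercept plus the variables in columns 1 and 3 of DAT would be described by `{2, 1, 1, 3}` with lfixed equal to 4.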

11: nrndm – Integer    Input

On entry: the second dimension of RNDM, i.e., the number of columns in the matrix RNDM that defines the structure of the random effects design matrix.

Constraint: $nrndm > 0$.

12: rndm[lrndm × nrndm] – const Integer    Input

Note: where $RNDM (i, j)$ appears in this document, it refers to the array element

$rndm [(j - 1) \times lrndm + i - 1]$ when $order = Nag_ColMajor$ ;
$rndm [(i - 1) \times nrndm + j - 1]$ when $order = Nag_RowMajor$ .

On entry: $RNDM (i, j)$ defines the structure of the random effects design matrix, $Z$. The $b$th column of RNDM defines a block of columns in the design matrix $Z$:

$RNDM (1, b)$: The number of variables, $N_{R_{b}}$ , to include as random effects in the $b$th block (not including the random intercept if present).
$RNDM (2, b)$: The random intercept flag which must contain $1$ if block $b$ includes a random intercept and $0$ otherwise.
$RNDM (2 + i, b)$: The column of DAT holding the $i$th random variable in the $b$th block, for $i = 1, 2, \dots, RNDM (1, b)$ .
$RNDM (3 + N_{R_{b}}, b)$: The number of subject variables, $N_{S_{b}}$ , for the $b$th block. The subject variables define the nesting structure for this block.
$RNDM (3 + N_{R_{b}} + i, b)$: The column of DAT holding the $i$th subject variable in the $b$th block, for $i = 1, 2, \dots, RNDM (3 + N_{R_{b}}, b)$ .

See Section 9.2 for more details on the construction of $Z$.

Constraints:

$RNDM (1, b) \geq 0$ ;
$RNDM (2, b) = 0$ or $1$ ;
at least one random variable or random intercept must be specified in each block, i.e., $RNDM (1, b) + RNDM (2, b) > 0$ ;
the column identifiers associated with the random variables must be in the range $1$ to ncol, i.e., $1 \leq RNDM (2 + i, b) \leq ncol$ , for $i = 1, 2, \dots, RNDM (1, b)$ ;
$RNDM (3 + N_{R_{b}}, b) \geq 0$ ;
the column identifiers associated with the subject variables must be in the range $1$ to ncol, i.e., $1 \leq RNDM (3 + N_{R_{b}} + i, b) \leq ncol$ , for $i = 1, 2, \dots, RNDM (3 + N_{R_{b}}, b)$ .

13: lrndm – Integer    Input

On entry: maximum number of entries in any column of RNDM.

Constraint: $lrndm \geq \max_{b} (3 + N_{R_{b}} + N_{S_{b}})$.
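The smallest admissible lrndm follows directly from the per-block counts $N_{R_{b}}$ and $N_{S_{b}}$. A minimal sketch, assuming those counts are held in two hypothetical arrays `nr` and `ns` of length nrndm:

```c
#include <assert.h>

/* Smallest lrndm satisfying lrndm >= max over b of (3 + N_R_b + N_S_b). */
static int min_lrndm(const int nr[], const int ns[], int nrndm)
{
    int best = 0;

    assert(nrndm > 0);
    for (int b = 0; b < nrndm; ++b) {
        int need = 3 + nr[b] + ns[b];
        assert(nr[b] >= 0 && ns[b] >= 0);
        if (need > best)
            best = need;
    }
    return best;
}
```

With two blocks, one having a single random variable and no subject variable and the other a single random variable and one subject variable, the result is 5.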

14: nff – Integer *    Output

On exit: $p$, the number of fixed effects estimated, i.e., the number of columns in the design matrix $X$.

15: nlsv – Integer *    Output

On exit: the number of levels for the overall subject variable (see Section 9.2 for a description of what this means). If there is no overall subject variable, $nlsv = 1$.

16: nrf – Integer *    Output

On exit: the number of random effects estimated in each of the overall subject blocks. The number of columns in the design matrix $Z$ is given by $q = nrf \times nlsv$.

17: rcomm[lrcomm] – double    Communication Array

On exit: communication array as required by the analysis functions nag_reml_hier_mixed_regsn (g02jdc) and nag_ml_hier_mixed_regsn (g02jec).

18: lrcomm – Integer    Input

On entry: the dimension of the array rcomm.

Constraint: $lrcomm \geq nrf \times nlsv + nff + nff \times nlsv + nrf \times nlsv + nff + 2$.

19: icomm[licomm] – Integer    Communication Array

On exit: if $licomm = 2$, $icomm [0]$ holds the minimum required value for licomm and $icomm [1]$ holds the minimum required value for lrcomm; otherwise icomm is a communication array as required by the analysis functions nag_reml_hier_mixed_regsn (g02jdc) and nag_ml_hier_mixed_regsn (g02jec).

20: licomm – Integer    Input

On entry: the dimension of the array icomm.

Constraint: $licomm = 2$ or $licomm \geq 34 + N_{F} \times (MFL + 1) + nrndm \times MNR \times MRL + (LRNDM + 2) \times nrndm + ncol + LDID \times LB$,

where

$MNR = \max_{b} (N_{R_{b}})$ ,
$MFL = \max_{i} (levels [fixed [2 + i - 1] - 1])$ ,
$MRL = \max_{b, i} (levels [RNDM (2 + i, b) - 1])$ ,
$LDID = \max_{b} N_{S_{b}}$ ,
$LB = nff + nrf \times nlsv$ , and
$LRNDM = \max_{b} (3 + N_{R_{b}} + N_{S_{b}})$ .
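The two workspace bounds are plain integer arithmetic once their terms are known. The sketch below simply evaluates the documented expressions; the struct `LicommTerms` and both function names are hypothetical. Because $LB$ depends on nff, nrf and nlsv, which are outputs of this function, the bounds cannot generally be evaluated in advance, which is what the $licomm = 2$ query described under icomm is for.

```c
#include <assert.h>

/* Terms of the licomm bound, named after the symbols in the text. */
typedef struct {
    int n_f;    /* N_F = fixed[0] */
    int mfl;    /* MFL */
    int nrndm;  /* number of columns of RNDM */
    int mnr;    /* MNR */
    int mrl;    /* MRL */
    int lrndm;  /* LRNDM */
    int ncol;   /* number of columns of DAT */
    int ldid;   /* LDID */
    int lb;     /* LB = nff + nrf * nlsv */
} LicommTerms;

/* 34 + N_F (MFL + 1) + nrndm MNR MRL + (LRNDM + 2) nrndm + ncol + LDID LB */
static int licomm_lower_bound(const LicommTerms *t)
{
    assert(t != 0 && t->nrndm > 0);
    return 34 + t->n_f * (t->mfl + 1) + t->nrndm * t->mnr * t->mrl
           + (t->lrndm + 2) * t->nrndm + t->ncol + t->ldid * t->lb;
}

/* nrf nlsv + nff + nff nlsv + nrf nlsv + nff + 2 */
static int lrcomm_lower_bound(int nff, int nlsv, int nrf)
{
    assert(nff >= 0 && nlsv >= 1 && nrf >= 0);
    return nrf * nlsv + nff + nff * nlsv + nrf * nlsv + nff + 2;
}
```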

21: fail – NagError *    Input/Output

The NAG error argument (see Section 3.6 in the Essential Introduction).

6 Error Indicators and Warnings

NE_ALLOC_FAIL: Dynamic memory allocation failed.
NE_BAD_PARAM: On entry, argument $⟨value⟩$ had an illegal value.
NE_INT: On entry, $lfixed = ⟨value⟩$ .
Constraint: $lfixed \geq ⟨value⟩$ .
On entry, $licomm = ⟨value⟩$ .
Constraint: $licomm \geq ⟨value⟩$ .
On entry, $lrcomm = ⟨value⟩$ .
Constraint: $lrcomm \geq ⟨value⟩$ .
On entry, $lrndm = ⟨value⟩$ .
Constraint: $lrndm \geq ⟨value⟩$ .
On entry, $n = ⟨value⟩$ .
Constraint: $n \geq 1$ .
On entry, $ncol = ⟨value⟩$ .
Constraint: $ncol \geq 0$ .
On entry, $nrndm = ⟨value⟩$ .
Constraint: $nrndm > 0$ .
NE_INT_2: On entry, $pddat = ⟨value⟩$ and $n = ⟨value⟩$ .
Constraint: $pddat \geq n$ .
On entry, $pddat = ⟨value⟩$ and $ncol = ⟨value⟩$ .
Constraint: $pddat \geq ncol$ .
NE_INT_ARRAY: On entry, index of fixed variable $j$ is less than $1$ or greater than $ncol$ : $j = ⟨value⟩$ , index $= ⟨value⟩$ and $ncol = ⟨value⟩$ .
On entry, index of random variable $j$ in random statement $i$ is less than $1$ or greater than $ncol$ : $i = ⟨value⟩$ , $j = ⟨value⟩$ , index $= ⟨value⟩$ and $ncol = ⟨value⟩$ .
On entry, invalid value for fixed intercept flag: value $= ⟨value⟩$ .
On entry, invalid value for random intercept flag for random statement $i$ : $i = ⟨value⟩$ , value $= ⟨value⟩$ .
On entry, $levels [⟨value⟩] = ⟨value⟩$ .
Constraint: $levels [i - 1] \geq 1$ .
On entry, must be at least one parameter, or an intercept in each random statement $i$ : $i = ⟨value⟩$ .
On entry, nesting variable $j$ in random statement $i$ has one level: $j = ⟨value⟩$ , $i = ⟨value⟩$ .
On entry, number of fixed parameters, $⟨value⟩$ is less than zero.
On entry, number of random parameters for random statement $i$ is less than $0$ : $i = ⟨value⟩$ , number of parameters $= ⟨value⟩$ .
On entry, number of subject parameters for random statement $i$ is less than $0$ : $i = ⟨value⟩$ , number of parameters $= ⟨value⟩$ .
NE_INTERNAL_ERROR: An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
NE_REAL_ARRAY: On entry, no observations due to zero weights.
On entry, variable $j$ of observation $i$ is less than $1$ or greater than $levels [j - 1]$ : $i = ⟨value⟩$ , $j = ⟨value⟩$ , value $= ⟨value⟩$ , $levels [j - 1] = ⟨value⟩$ .
On entry, $wt [⟨value⟩] = ⟨value⟩$ .
Constraint: $wt [i - 1] \geq 0.0$ .
NE_TOO_MANY: On entry, more fixed factors than observations, $n = ⟨value⟩$ .
Constraint: $n \geq ⟨value⟩$ .

7 Accuracy

Not applicable.

8 Parallelism and Performance

Not applicable.

9 Further Comments

9.1 Construction of the fixed effects design matrix, $X$

Let

$N_{F}$ denote the number of fixed variables, that is $fixed [0] = N_{F}$ ;
$F_{j}$ denote the $j$ th fixed variable, that is the vector of values held in the $k$ th column of DAT when $fixed [2 + j - 1] = k$ ;
$F_{i j}$ denote the $i$ th element of $F_{j}$ ;
$L (F_{j})$ denote the number of levels for $F_{j}$ , that is $L (F_{j}) = levels [fixed [2 + j - 1] - 1]$ ;
$D_{v} (F_{j})$ denote an indicator function that returns a vector of values whose $i$ th element is $1$ if $F_{i j} = v$ and $0$ otherwise.

The design matrix for the fixed effects, $X$, is constructed as follows:

set $k$ to zero and the flag $done_first$ to false;
if a fixed intercept is included, that is $fixed [1] = 1$,
- set the first column of $X$ to a vector of $1$ s;
- set $k = k + 1$ ;
- set $done_first$ to true;
loop over each fixed variable, so for each $j = 1, 2, \dots, N_{F}$,
- if $L (F_{j}) = 1$,
  - set the $k$ th column of $X$ to be $F_{j}$ ;
  - set $k = k + 1$ ;
- else
  - if $done_first$ is false then
    - set the $L (F_{j})$ columns, $k$ to $k + L (F_{j}) - 1$ , of $X$ to $D_{v} (F_{j})$ , for $v = 1, 2, \dots, L (F_{j})$ ;
    - set $k = k + L (F_{j})$ ;
    - set $done_first$ to true;
  - else
    - set the $L (F_{j}) - 1$ columns, $k$ to $k + L (F_{j}) - 2$ , of $X$ to $D_{v} (F_{j})$ , for $v = 2, 3, \dots, L (F_{j})$ ;
    - set $k = k + L (F_{j}) - 1$ .
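The steps above can be sketched in C as follows. This is an illustration of the construction, not the library's implementation: the function name `build_fixed_design` is hypothetical, both dat and the output are taken to be row-major (a layout chosen for the sketch), $k$ is treated as a zero-based column index, and plain `int` stands in for `Integer`.

```c
#include <assert.h>

/* Fills x (n rows of ldx values) from dat (n rows of ncol values) and
   returns the number of columns of X that were filled. */
static int build_fixed_design(int n, int ncol, const double dat[],
                              const int levels[], const int fixed[],
                              double x[], int ldx)
{
    int k = 0;            /* zero-based index of the next free column */
    int done_first = 0;

    if (fixed[1] == 1) {                  /* fixed intercept: column of ones */
        assert(k < ldx);
        for (int i = 0; i < n; ++i)
            x[i * ldx + k] = 1.0;
        ++k;
        done_first = 1;
    }
    for (int j = 1; j <= fixed[0]; ++j) {
        int col = fixed[2 + j - 1] - 1;   /* zero-based column of DAT */
        int nlev = levels[col];
        int first_level = 1;

        if (nlev == 1) {                  /* continuous or binary: copy */
            assert(k < ldx);
            for (int i = 0; i < n; ++i)
                x[i * ldx + k] = dat[i * ncol + col];
            ++k;
            continue;
        }
        if (done_first)
            first_level = 2;              /* drop the first level */
        else
            done_first = 1;               /* keep every level, once only */
        for (int v = first_level; v <= nlev; ++v) {
            assert(k < ldx);
            for (int i = 0; i < n; ++i)   /* indicator column D_v(F_j) */
                x[i * ldx + k] = (dat[i * ncol + col] == (double)v) ? 1.0 : 0.0;
            ++k;
        }
    }
    return k;
}
```

With one continuous variable and one three-level factor, the sketch produces four columns whether or not an intercept is requested: the intercept, the continuous variable and two dummy columns in the first case, and the continuous variable followed by three dummy columns in the second.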

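When every fixed variable is categorical (i.e., $L (F_{j}) > 1$ for all $j$), the number of columns in the design matrix, $X$, is therefore given by

p = 1 + \sum_{j = 1}^{N_{F}} (levels [fixed [2 + j - 1] - 1] - 1) .

By the construction above, each continuous or binary variable ($L (F_{j}) = 1$) instead contributes one column, and the leading $1$ is absent when there is neither a fixed intercept nor a categorical variable. In every case the number of columns is returned in nff.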

In summary, nag_hier_mixed_init (g02jcc) converts all non-binary categorical variables (i.e., where $L (F_{j}) > 1$) to dummy variables. If a fixed intercept is included in the model then the first level of all such variables is dropped. If a fixed intercept is not included in the model then the first level of all such variables, other than the first, is dropped. The variables are added into the model in the order they are specified in fixed.

9.2 Construction of random effects design matrix, $Z$

Let

$N_{R_{b}}$ denote the number of random variables in the $b$ th random statement, that is $N_{R_{b}} = RNDM (1, b)$ ;
$R_{j b}$ denote the $j$ th random variable from the $b$ th random statement, that is the vector of values held in the $k$ th column of DAT when $RNDM (2 + j, b) = k$ ;
$R_{i j b}$ denote the $i$ th element of $R_{j b}$ ;
$L (R_{j b})$ denote the number of levels for $R_{j b}$ , that is $L (R_{j b}) = levels [RNDM (2 + j, b) - 1]$ ;
$D_{v} (R_{j b})$ denote an indicator function that returns a vector of values whose $i$ th element is $1$ if $R_{i j b} = v$ and $0$ otherwise;
$N_{S_{b}}$ denote the number of subject variables in the $b$ th random statement, that is $N_{S_{b}} = RNDM (3 + N_{R_{b}}, b)$ ;
$S_{j b}$ denote the $j$ th subject variable from the $b$ th random statement, that is the vector of values held in the $k$ th column of DAT when $RNDM (3 + N_{R_{b}} + j, b) = k$ ;
$S_{i j b}$ denote the $i$ th element of $S_{j b}$ ;
$L (S_{j b})$ denote the number of levels for $S_{j b}$ , that is $L (S_{j b}) = levels [RNDM (3 + N_{R_{b}} + j, b) - 1]$ ;
$I_{b} (s_{1}, s_{2}, \dots, s_{N_{S_{b}}})$ denote an indicator function that returns a vector of values whose $i$ th element is $1$ if $S_{i j b} = s_{j}$ for all $j = 1, 2, \dots, N_{S_{b}}$ and $0$ otherwise.

The design matrix for the random effects, $Z$, is constructed as follows:

set $k$ to zero;
loop over each random statement, so for each $b = 1, 2, \dots, nrndm$,
- loop over each level of the last subject variable, so for each $s_{N_{S_{b}}} = 1, 2, \dots, L (S_{N_{S_{b}} b})$,
  - ⋮
    - loop over each level of the second subject variable, so for each $s_{2} = 1, 2, \dots, L (S_{2 b})$,
      - loop over each level of the first subject variable, so for each $s_{1} = 1, 2, \dots, L (S_{1 b})$,
        - if a random intercept is included, that is $RNDM (2, b) = 1$,
          - set the $k$ th column of $Z$ to $I_{b} (s_{1}, s_{2}, \dots, s_{N_{S_{b}}})$ ;
          - set $k = k + 1$ ;
        - loop over each random variable in the $b$ th random statement, so for each $j = 1, 2, \dots, N_{R_{b}}$ ,
          - if $L (R_{j b}) = 1$ ,
            - set the $k$ th column of $Z$ to $R_{j b} \times I_{b} (s_{1}, s_{2}, \dots, s_{N_{S_{b}}})$ where $\times$ indicates an element-wise multiplication between the two vectors, $R_{j b}$ and $I_{b} (\dots)$ ;
            - set $k = k + 1$ ;
          - else
            - set the $L (R_{j b})$ columns, $k$ to $k + L (R_{j b}) - 1$ , of $Z$ to $D_{v} (R_{j b}) \times I_{b} (s_{1}, s_{2}, \dots, s_{N_{S_{b}}})$ , for $v = 1, 2, \dots, L (R_{j b})$ . As before, $\times$ indicates an element-wise multiplication between the two vectors, $D_{v} (\dots)$ and $I_{b} (\dots)$ ;
            - set $k = k + L (R_{j b})$ .
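Following the loops above, each combination of subject levels in block $b$ receives one column for the random intercept (if present) plus $L (R_{j b})$ columns for each random variable, since a variable with one level contributes a single column. The total number of columns generated can therefore be counted without building $Z$. The sketch below assumes rndm is stored column-major with leading dimension lrndm; the name `count_random_columns` is hypothetical and plain `int` stands in for `Integer`.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the columns of Z generated by the construction in the text.
   RNDM(i, b), both counted from 1, is rndm[(b - 1) * lrndm + i - 1]. */
static int count_random_columns(const int rndm[], int lrndm, int nrndm,
                                const int levels[])
{
    int q = 0;

    assert(nrndm > 0);
    for (int b = 0; b < nrndm; ++b) {
        const int *blk = rndm + (size_t)b * (size_t)lrndm;
        int nr = blk[0];              /* N_R_b */
        int per_subject = blk[1];     /* 1 column for a random intercept */
        int ns;
        int combos = 1;               /* product of the subject levels */

        assert(3 + nr <= lrndm);
        ns = blk[2 + nr];             /* N_S_b */
        assert(3 + nr + ns <= lrndm);
        for (int j = 0; j < nr; ++j)
            per_subject += levels[blk[2 + j] - 1];
        for (int j = 0; j < ns; ++j)
            combos *= levels[blk[3 + nr + j] - 1];
        q += combos * per_subject;
    }
    return q;
}
```

For the two-block example of Section 9.3, where the variables have 2, 4 and 3 levels, the count is 2 + 4 × 3 = 14.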

In summary, each column of RNDM defines a block of consecutive columns in $Z$. nag_hier_mixed_init (g02jcc) converts all non-binary categorical variables (i.e., where $L (R_{j b})$ or $L (S_{j b}) > 1$) to dummy variables. All random variables defined within a column of RNDM are nested within all subject variables defined in the same column of RNDM. In addition the subject variables are nested within each other, starting with the first (i.e., each of the $R_{j b}, j = 1, 2, \dots, N_{R_{b}}$ is nested within $S_{1 b}$, which in turn is nested within $S_{2 b}$, which in turn is nested within $S_{3 b}$, etc.).

If the last subject variable in each column of RNDM is the same (i.e., $S_{N_{S_{1}} 1} = S_{N_{S_{2}} 2} = \dots = S_{N_{S_{b}} b}$) then all random effects in the model are nested within this variable. In such instances the last subject variable ($S_{N_{S_{1}} 1}$) is called the overall subject variable. The fact that all of the random effects in the model are nested within the overall subject variable means that $Z^{T} Z$ is block diagonal in structure. This fact can be utilised to improve the efficiency of the underlying computation and reduce the amount of internal storage required. The number of levels in the overall subject variable is returned in $nlsv = L (S_{N_{S_{1}} 1})$.

If the last $k$ subject variables in each column of RNDM are the same, for $k > 1$, then the overall subject variable is defined as the interaction of these $k$ variables and

nlsv = \prod_{j = N_{S_{1}} - k + 1}^{N_{S_{1}}} L (S_{j 1}) .

If there is no overall subject variable then $nlsv = 1$.

The number of columns in the design matrix $Z$ is given by $q = nrf \times nlsv$.
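The rule for nlsv can be sketched as a search for the trailing subject variables shared by every column of RNDM. The code below is one reading of the text, not the library's implementation: it assumes that "the same" means the same column of DAT, that the largest possible $k$ is used, and that rndm is stored column-major with leading dimension lrndm. The name `overall_subject_levels` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Product of the levels of the trailing subject variables common to all
   blocks; returns 1 when there is no overall subject variable. */
static int overall_subject_levels(const int rndm[], int lrndm, int nrndm,
                                  const int levels[])
{
    int nr0 = rndm[0];            /* N_R_1 */
    int ns0 = rndm[2 + nr0];      /* N_S_1 */
    int nlsv = 1;

    assert(nrndm > 0 && 3 + nr0 + ns0 <= lrndm);
    for (int k = 1; k <= ns0; ++k) {
        /* k-th subject variable counted from the end of the first block */
        int col = rndm[3 + nr0 + ns0 - k];

        for (int b = 1; b < nrndm; ++b) {
            const int *blk = rndm + (size_t)b * (size_t)lrndm;
            int nr = blk[0];
            int ns = blk[2 + nr];

            if (k > ns || blk[3 + nr + ns - k] != col)
                return nlsv;      /* this block does not share the variable */
        }
        nlsv *= levels[col - 1];
    }
    return nlsv;
}
```

Given the total number of columns $q$ of $Z$, the relation $q = nrf \times nlsv$ then gives nrf as $q / nlsv$.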

9.3 The rndm argument

To illustrate some additional points about the rndm argument, we assume that we have a dataset with three discrete variables, $V_{1}$, $V_{2}$ and $V_{3}$, with $2$, $4$ and $3$ levels respectively, and that $V_{1}$ is in the first column of DAT, $V_{2}$ in the second and $V_{3}$ the third. Also assume that we wish to fit a model containing $V_{1}$ along with $V_{2}$ nested within $V_{3}$, as random effects. In order to do this the RNDM matrix requires two columns:

RNDM = (\begin{matrix} 1 & 1 \\ 0 & 0 \\ 1 & 2 \\ 0 & 1 \\ 0 & 3 \end{matrix})

The first column, $(1, 0, 1, 0, 0)$, indicates one random variable ($RNDM (1, 1) = 1$), no intercept ($RNDM (2, 1) = 0$), that the random variable is in the first column of DAT ($RNDM (3, 1) = 1$), and that there are no subject variables, as no nesting is required for $V_{1}$ ($RNDM (4, 1) = 0$). The last element in this column is ignored.

The second column, $(1, 0, 2, 1, 3)$, indicates one random variable ($RNDM (1, 2) = 1$), no intercept ($RNDM (2, 2) = 0$), that the random variable is in the second column of DAT ($RNDM (3, 2) = 2$), that there is one subject variable ($RNDM (4, 2) = 1$), and that the subject variable is in the third column of DAT ($RNDM (5, 2) = 3$).
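How this RNDM matrix is laid out in the rndm array depends on order. The sketch below packs a list of RNDM columns into either layout using the element mapping given under the rndm argument. The `StorageOrder` enum and the names `rndm_offset` and `pack_rndm` are hypothetical stand-ins, not part of the NAG interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two Nag_OrderType values used in this document. */
typedef enum { ROW_MAJOR, COL_MAJOR } StorageOrder;

/* Offset within rndm of RNDM(i, j), with i and j counted from 1. */
static size_t rndm_offset(StorageOrder order, int i, int j, int lrndm, int nrndm)
{
    assert(i >= 1 && i <= lrndm && j >= 1 && j <= nrndm);
    if (order == COL_MAJOR)
        return (size_t)(j - 1) * (size_t)lrndm + (size_t)(i - 1);
    return (size_t)(i - 1) * (size_t)nrndm + (size_t)(j - 1);
}

/* cols lists the columns of RNDM one after another, lrndm entries each. */
static void pack_rndm(StorageOrder order, int lrndm, int nrndm,
                      const int cols[], int rndm[])
{
    for (int j = 1; j <= nrndm; ++j)
        for (int i = 1; i <= lrndm; ++i)
            rndm[rndm_offset(order, i, j, lrndm, nrndm)] =
                cols[(j - 1) * lrndm + (i - 1)];
}
```

For the two columns above, the column-major array is `{1, 0, 1, 0, 0, 1, 0, 2, 1, 3}` and the row-major array is `{1, 1, 0, 0, 1, 2, 0, 1, 0, 3}`.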

The corresponding $Z$ matrix would have $14$ columns, with $2$ coming from $V_{1}$ and $12$ ($4 \times 3$) from $V_{2}$ nested within $V_{3}$. The symmetric $Z^{T} Z$ matrix has the form

(\begin{matrix} - & - & - & - & - & - & - & - & - & - & - & - & - & - \\ - & - & - & - & - & - & - & - & - & - & - & - & - & - \\ - & - & - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ - & - & - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ - & - & - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ - & - & - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ - & - & 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 \\ - & - & 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 \\ - & - & 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 \\ - & - & 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 \\ - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - \\ - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - \\ - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - \\ - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - \end{matrix})

where $0$ indicates a structural zero, i.e., it always takes the value $0$, irrespective of the data, and $-$ a value that is not a structural zero. The first two rows and columns of $Z^{T} Z$ correspond to $V_{1}$. The block diagonal matrix in the $12$ rows and columns in the bottom right corresponds to $V_{2}$ nested within $V_{3}$, with the $4 \times 4$ blocks corresponding to the levels of $V_{2}$. There are three blocks as the subject variable ($V_{3}$) has three levels.

The model fitting functions, nag_reml_hier_mixed_regsn (g02jdc) and nag_ml_hier_mixed_regsn (g02jec), use the sweep algorithm to calculate the log likelihood function for a given set of variance components. This algorithm consists of moving down the diagonal elements (called pivots) of a matrix which is similar in structure to $Z^{T} Z$, and updating each element in that matrix. When using the $k$th diagonal element of a matrix $A$, an element $a_{i j}, i \neq k, j \neq k$, is adjusted by an amount equal to $a_{i k} a_{k j} / a_{k k}$. This process can be referred to as sweeping on the $k$th pivot. As there are no structural zeros in the first row or column of the above $Z^{T} Z$, sweeping on the first pivot of $Z^{T} Z$ would alter each element of the matrix and therefore destroy the structural zeros, i.e., we could no longer guarantee they would be zero.
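The effect of a pivot on structural zeros can be seen with a small dense sketch. The function below applies only the adjustment $a_{i k} a_{k j} / a_{k k}$ to the elements with $i \neq k$ and $j \neq k$; it subtracts the amount, which is the usual sign convention for the sweep operator and an assumption here, and it leaves the pivot row and column untouched. The name `sweep_adjust` is hypothetical and this is not the library's implementation.

```c
#include <assert.h>

/* Adjusts the m-by-m row-major matrix a for pivot k (zero-based):
   a[i][j] -= a[i][k] * a[k][j] / a[k][k] for all i != k and j != k. */
static void sweep_adjust(double a[], int m, int k)
{
    double pivot;

    assert(k >= 0 && k < m);
    pivot = a[k * m + k];
    assert(pivot != 0.0);
    for (int i = 0; i < m; ++i) {
        if (i == k)
            continue;
        for (int j = 0; j < m; ++j) {
            if (j == k)
                continue;
            /* only row k and column k are read, and neither is modified */
            a[i * m + j] -= a[i * m + k] * a[k * m + j] / pivot;
        }
    }
}
```

Applied to a $4 \times 4$ matrix whose first row and column are dense and whose remaining block is diagonal, pivoting on the first element fills in every off-diagonal zero. With the dense row and column placed last instead, pivoting on the first element changes only the bottom right element and leaves the zeros in place.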

Reordering the RNDM matrix to

RNDM = (\begin{matrix} 1 & 1 \\ 0 & 0 \\ 2 & 1 \\ 1 & 0 \\ 3 & 0 \end{matrix})

i.e., swapping the two columns, results in a $Z^{T} Z$ matrix of the form

(\begin{matrix} - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - \\ - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - \\ - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - \\ - & - & - & - & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - \\ 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 & - & - \\ 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 & - & - \\ 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 & - & - \\ 0 & 0 & 0 & 0 & - & - & - & - & 0 & 0 & 0 & 0 & - & - \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - & - & - \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - & - & - \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - & - & - \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & - & - & - & - & - & - \\ - & - & - & - & - & - & - & - & - & - & - & - & - & - \\ - & - & - & - & - & - & - & - & - & - & - & - & - & - \end{matrix})

This matrix is identical to the previous one, except the first two rows and columns have become the last two rows and columns. Sweeping a matrix, $A = \{a_{i j}\}$, of this form on the first pivot will only affect those elements $a_{i j}$ where $a_{i 1} \neq 0$ and $a_{1 j} \neq 0$, which is only the $13$th and $14$th rows and columns, and the top left hand block of $4$ rows and columns. The block diagonal nature of the first $12$ rows and columns therefore greatly reduces the amount of work the algorithm needs to perform.

nag_hier_mixed_init (g02jcc) constructs $Z^{T} Z$ as specified by the RNDM matrix, and does not attempt to reorder it to improve performance. Therefore for best performance some thought is required on what ordering to use. In general it is more efficient to structure RNDM in such a way that the first column relates to the deepest level of nesting, the second to the next level, etc.