NAG CL Interface
g02jcc (mixeff_​hier_​init)

Note: this function is deprecated. Replaced by g02jfc.

1 Purpose

g02jcc preprocesses a dataset prior to fitting a linear mixed effects regression model of the following form via either g02jdc or g02jec.

2 Specification

#include <nag.h>
void  g02jcc (Nag_OrderType order, Integer n, Integer ncol, const double dat[], Integer pddat, const Integer levels[], const double y[], const double wt[], const Integer fixed[], Integer lfixed, Integer nrndm, const Integer rndm[], Integer lrndm, Integer *nff, Integer *nlsv, Integer *nrf, double rcomm[], Integer lrcomm, Integer icomm[], Integer licomm, NagError *fail)
The function may be called by the names: g02jcc, nag_correg_mixeff_hier_init or nag_hier_mixed_init.

3 Description

g02jcc must be called prior to fitting a linear mixed effects regression model with either g02jdc or g02jec.
The model fitting functions g02jdc and g02jec fit a model of the following form:
y=Xβ+Zν+ε  
where y is a vector of n observations on the dependent variable,
X is an n by p design matrix of fixed independent variables,
β is a vector of p unknown fixed effects,
Z is an n by q design matrix of random independent variables,
ν is a vector of length q of unknown random effects,
ε is a vector of length n of unknown random errors,
and ν and ε are Normally distributed with expectation zero and variance/covariance matrix defined by
Var ν ε = G 0 0 R  
where R= σ R 2 I , I is the n×n identity matrix and G is a diagonal matrix.
Case weights can be incorporated into the model by replacing X and Z with Wc1/2X and Wc1/2Z respectively where Wc is a diagonal weight matrix.

4 References

None.

5 Arguments

1: order Nag_OrderType Input
On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by order=Nag_RowMajor. See Section 3.1.3 in the Introduction to the NAG Library CL Interface for a more detailed explanation of the use of this argument.
Constraint: order=Nag_RowMajor or Nag_ColMajor.
2: n Integer Input
On entry: n, the number of observations.
The effective number of observations, that is the number of observations with nonzero weight (see wt for more detail), must be greater than the number of fixed effects in the model (as returned in nff).
Constraint: n1.
3: ncol Integer Input
On entry: the number of columns in the data matrix, dat.
Constraint: ncol0.
4: dat[dim] const double Input
Note: the dimension, dim, of the array dat must be at least
  • max1,pddat×ncol when order=Nag_ColMajor;
  • max1,n×pddat when order=Nag_RowMajor.
where DATi,j appears in this document, it refers to the array element
  • dat[j-1×pddat+i-1] when order=Nag_ColMajor;
  • dat[i-1×pddat+j-1] when order=Nag_RowMajor.
On entry: a matrix of data, with DATi,j holding the ith observation on the jth variable. The two design matrices X and Z are constructed from dat and the information given in fixed (for X) and rndm (for Z).
Constraint: if levels[j-1]1,1DATi,jlevels[j-1].
5: pddat Integer Input
On entry: the stride separating row or column elements (depending on the value of order) in the array dat.
Constraints:
  • if order=Nag_ColMajor, pddatn;
  • if order=Nag_RowMajor, pddatncol.
6: levels[ncol] const Integer Input
On entry: levels[i-1] contains the number of levels associated with the ith variable held in dat.
If the ith variable is continuous or binary (i.e., only takes the values zero or one), then levels[i-1] must be set to 1. Otherwise the ith variable is assumed to take an integer value between 1 and levels[i-1], (i.e., the ith variable is discrete with levels[i-1] levels).
Constraint: levels[i-1]1, for i=1,2,,ncol.
7: y[n] const double Input
On entry: y, the vector of observations on the dependent variable.
8: wt[dim] const double Input
Note: the dimension, dim, of the array wt must be at least
  • n, when wtis notNULL.
On entry: optionally, the weights to be used in the weighted regression.
If wt[i-1]=0.0, the ith observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights.
If weights are not provided then wt must be set to NULL and the effective number of observations is n.
Constraint: if wtis notNULL, wt[i-1]0.0, for i=1,2,,n.
9: fixed[lfixed] const Integer Input
On entry: defines the structure of the fixed effects design matrix, X.
fixed[0]
The number of variables, NF, to include as fixed effects (not including the intercept if present).
fixed[1]
The fixed intercept flag which must contain 1 if a fixed intercept is to be included and 0 otherwise.
fixed[2+i-1]
The column of DAT holding the ith fixed variable, for i=1,2,,fixed[0].
See Section 9.1 for more details on the construction of X.
Constraints:
  • fixed[0]0;
  • fixed[1]=0 or 1;
  • 1fixed[2+i-1]ncol, for i=1,2,,fixed[0].
10: lfixed Integer Input
On entry: length of the vector fixed.
Constraint: lfixed2+fixed[0].
11: nrndm Integer Input
On entry: the number of columns in RNDM.
Constraint: nrndm>0.
12: rndm[lrndm×nrndm] const Integer Input
Note: where RNDMi,j appears in this document, it refers to the array element
  • rndm[j-1×lrndm+i-1] when order=Nag_ColMajor;
  • rndm[i-1×nrndm+j-1] when order=Nag_RowMajor.
On entry: RNDMi,j defines the structure of the random effects design matrix, Z. The bth column of RNDM defines a block of columns in the design matrix Z.
RNDM1,b
The number of variables, NRb, to include as random effects in the bth block (not including the random intercept if present).
RNDM2,b
The random intercept flag which must contain 1 if block b includes a random intercept and 0 otherwise.
RNDM2+i,b
The column of DAT holding the ith random variable in the bth block, for i=1,2,,RNDM1,b.
RNDM3+NRb,b
The number of subject variables, NSb, for the bth block. The subject variables define the nesting structure for this block.
RNDM3+NRb+i,b
The column of DAT holding the ith subject variable in the bth block, for i=1,2,,RNDM3+NRb,b.
See Section 9.2 for more details on the construction of Z.
Constraints:
  • RNDM1,b0;
  • RNDM2,b=0 or 1;
  • at least one random variable or random intercept must be specified in each block, i.e., RNDM1,b + RNDM2,b > 0 ;
  • the column identifiers associated with the random variables must be in the range 1 to ncol, i.e., 1 RNDM2+i,b ncol , for i=1,2,,RNDM1,b;
  • RNDM3+NRb,b 0 ;
  • the column identifiers associated with the subject variables must be in the range 1 to ncol, i.e., 1 RNDM3+ N R b +i ,b ncol , for i=1,2,,RNDM3+NRb,b.
13: lrndm Integer Input
On entry: maximum number of entries in any column of RNDM.
Constraint: lrndm max b 3+NRb+NSb .
14: nff Integer * Output
On exit: p, the number of fixed effects estimated, i.e., the number of columns in the design matrix X.
15: nlsv Integer * Output
On exit: the number of levels for the overall subject variable (see Section 9.2 for a description of what this means). If there is no overall subject variable, nlsv=1.
16: nrf Integer * Output
On exit: the number of random effects estimated in each of the overall subject blocks. The number of columns in the design matrix Z is given by q=nrf×nlsv.
17: rcomm[lrcomm] double Communication Array
On exit: communication array as required by the analysis functions g02jdc and g02jec.
18: lrcomm Integer Input
On entry: the dimension of the array rcomm.
Constraint: lrcommnff×nrf+nrf×nrf+nrf×nlsv+nff×nff+nff+2.
19: icomm[licomm] Integer Communication Array
On exit: if licomm=2, icomm[0] holds the minimum required value for licomm and icomm[1] holds the minimum required value for lrcomm, otherwise icomm is a communication array as required by the analysis functions g02jdc and g02jec.
20: licomm Integer Input
On entry: the dimension of the array icomm.
Constraint: licomm=2 or licomm34+ NF×MFL+1+ nrndm×MNR×MRL+LRNDM+2×nrndm+ ncol+LDID×LB,
where
  • MNR = maxb N R b ,
  • MFL=maxi levels[fixed[2+i-1]-1] ,
  • MRL=maxb,i levels[RNDM2+i,b-1] ,
  • LDID=maxb NSb ,
  • LB=nff+nrf×nlsv, and
  • LRNDM= max b 3+NRb+NSb
21: fail NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).

6 Error Indicators and Warnings

NE_ALLOC_FAIL
Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
NE_BAD_PARAM
On entry, argument value had an illegal value.
NE_INT
On entry, lfixed=value.
Constraint: lfixedvalue.
On entry, licomm=value.
Constraint: licommvalue.
On entry, lrcomm=value.
Constraint: lrcommvalue.
On entry, lrndm=value.
Constraint: lrndmvalue.
On entry, n=value.
Constraint: n1.
On entry, ncol=value.
Constraint: ncol0.
On entry, nrndm=value.
Constraint: nrndm>0.
NE_INT_2
On entry, pddat=value and n=value.
Constraint: pddatn.
On entry, pddat=value and ncol=value.
Constraint: pddatncol.
NE_INT_ARRAY
On entry, index of fixed variable j is less than 1 or greater than ncol: j=value, index =value and ncol=value.
On entry, index of random variable j in random statement i is less than 1 or greater than ncol: i=value, j=value, index =value and ncol=value.
On entry, invalid value for fixed intercept flag: value =value.
On entry, invalid value for random intercept flag for random statement i: i=value, value =value.
On entry, levels[value]=value.
Constraint: levels[i-1]1.
On entry, must be at least one parameter, or an intercept in each random statement i: i=value.
On entry, nesting variable j in random statement i has one level: j=value, i=value.
On entry, number of fixed parameters, value is less than zero.
On entry, number of random parameters for random statement i is less than 0: i=value, number of parameters =value.
On entry, number of subject parameters for random statement i is less than 0: i=value, number of parameters =value.
NE_INTERNAL_ERROR
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
NE_NO_LICENCE
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
NE_REAL_ARRAY
On entry, no observations due to zero weights.
On entry, variable j of observation i is less than 1 or greater than levels[j-1]: i=value, j=value, value =value, levels[j-1]=value.
On entry, wt[value]=value.
Constraint: wt[i-1]0.0.
NE_TOO_MANY
On entry, more fixed factors than observations, n=value.
Constraint: nvalue.

7 Accuracy

Not applicable.

8 Parallelism and Performance

g02jcc is not threaded in any implementation.

9 Further Comments

9.1 Construction of the fixed effects design matrix, X

Let
The design matrix for the fixed effects, X, is constructed as follows:
The number of columns in the design matrix, X, is therefore given by
p= 1+ j=1 N F levels[fixed[2+j-1]-1] -1 .  
This quantity is returned in nff.
In summary, g02jcc converts all non-binary categorical variables (i.e., where LFj>1) to dummy variables. If a fixed intercept is included in the model then the first level of all such variables is dropped. If a fixed intercept is not included in the model then the first level of all such variables, other than the first, is dropped. The variables are added into the model in the order they are specified in fixed.

9.2 Construction of random effects design matrix, Z

Let
The design matrix for the random effects, Z, is constructed as follows:
In summary, each column of RNDM defines a block of consecutive columns in Z. g02jcc converts all non-binary categorical variables (i.e., where LRjb or LSjb>1) to dummy variables. All random variables defined within a column of RNDM are nested within all subject variables defined in the same column of RNDM. In addition each of the subject variables are nested within each other, starting with the first (i.e., each of the Rjb,j=1,2,,NRb are nested within S1b which in turn is nested within S2b, which in turn is nested within S3b, etc.).
If the last subject variable in each column of RNDM are the same (i.e., SNS11 = SNS22 = = SNSbb ) then all random effects in the model are nested within this variable. In such instances the last subject variable ( SNS11 ) is called the overall subject variable. The fact that all of the random effects in the model are nested within the overall subject variable means that ZTZ is block diagonal in structure. This fact can be utilised to improve the efficiency of the underlying computation and reduce the amount of internal storage required. The number of levels in the overall subject variable is returned in nlsv=LSNS11.
If the last k subject variables in each column of RNDM are the same, for k>1 then the overall subject variable is defined as the interaction of these k variables and
nlsv= j=NS1-k+1 NS1 LSj1 .  
If there is no overall subject variable then nlsv=1.
The number of columns in the design matrix Z is given by q=nrf×nlsv.

10 Example

See Section 10 in g02jdc and g02jec.