NAG FL Interface
g02jhf (lmm_​fit)

Settings help

FL Name Style:


FL Specification Language:


1 Purpose

g02jhf fits a multi-level linear mixed effects regression model using restricted maximum likelihood (REML) or maximum likelihood (ML). Prior to calling g02jhf the initialization routine g02jff must be called.

2 Specification

Fortran Interface
Subroutine g02jhf ( hlmm, nvpr, gamma, effn, rnkx, ncov, lnlike, lb, b, se, czz, ldczz, cxx, ldcxx, cxz, ldcxz, rcomm, icomm, ifail)
Integer, Intent (In) :: nvpr, lb, ldczz, ldcxx, ldcxz
Integer, Intent (Inout) :: icomm(*), ifail
Integer, Intent (Out) :: effn, rnkx, ncov
Real (Kind=nag_wp), Intent (Inout) :: gamma(nvpr+1), czz(ldczz,rnlsv*nrf), cxx(ldcxx,fnlsv*nff), cxz(ldcxz,fnlsv*nff), rcomm(*)
Real (Kind=nag_wp), Intent (Out) :: lnlike, b(lb), se(lb)
Type (c_ptr), Intent (In) :: hlmm
C Header Interface
#include <nag.h>
void  g02jhf_ (void **hlmm, const Integer *nvpr, double gamma[], Integer *effn, Integer *rnkx, Integer *ncov, double *lnlike, const Integer *lb, double b[], double se[], double czz[], const Integer *ldczz, double cxx[], const Integer *ldcxx, double cxz[], const Integer *ldcxz, double rcomm[], Integer icomm[], Integer *ifail)
The routine may be called by the names g02jhf or nagf_correg_lmm_fit.

3 Description

g02jhf fits a model of the form:
y=Xβ+Zν+ε  
where y is a vector of n observations on the dependent variable,
X is a known n×p design matrix for the fixed independent variables,
β is a vector of length p of unknown fixed effects,
Z is a known n×q design matrix for the random independent variables,
ν is a vector of length q of unknown random effects,
and ε is a vector of length n of unknown random errors.
Both ν and ε are assumed to have a Gaussian distribution with expectation zero and variance/covariance matrix defined by
Var[ ν ε ] = [ G 0 0 R ]  
where R= σ R 2 I , I is the n×n identity matrix and G is a diagonal matrix. It is assumed that the random variables, Z , can be subdivided into g q groups with each group being identically distributed with expectation zero and variance σi2 . The diagonal elements of matrix G , therefore, take one of the values {σi2:i=1,2,,g} , depending on which group the associated random variable belongs to.
The model, therefore, contains three sets of unknowns: the fixed effects β, the random effects ν and a vector of g+1 variance components γ , where γ = {σ12,σ22,, σ g-1 2 ,σg2,σR2} . Rather than working directly with γ , g02jhf uses an iterative process to estimate γ* = { σ12 / σR2 , σ22 / σR2 ,, σg-12 / σR2 , σg2 / σR2 ,1} . Due to the iterative nature of the estimation a set of initial values, γ0 , for γ* is required. g02jhf allows these initial values either to be supplied by you or calculated from the data using the minimum variance quadratic unbiased estimators (MIVQUE0) suggested by Rao (1972).
g02jhf fits the model by maximizing the restricted log-likelihood function:
−2 l R = log(|V|) + (n-p) log( r TV-1r) + log| X TV-1X| + (n-p) (1+log(2π/(n-p)))  
or the log-likelihood function:
−2 l R = log(|V|) + n log( r TV-1r) + log(2π/n)  
where
V = ZG ZT + R,   r=y-Xb   and   b = (XTV-1X)-1 XT V-1 y .  
By default the restricted log-likelihood function is used, the log-likelihood function can be chosen through the optional parameter Solver as detailed in the documentation for g02jff.
Once the final estimates for γ * have been obtained, the value of σR2 is given by
σR2 = (rTV-1r) / (n-p) .  
Case weights, Wc , can be incorporated into the model by replacing XTX and ZTZ with XTWcX and ZTWcZ respectively, for a diagonal weight matrix Wc .
The log-likelihood, lR, is calculated using the sweep algorithm detailed in Wolfinger et al. (1994).

4 References

Goodnight J H (1979) A tutorial on the SWEEP operator The American Statistician 33(3) 149–158
Harville D A (1977) Maximum likelihood approaches to variance component estimation and to related problems JASA 72 320–340
Rao C R (1972) Estimation of variance and covariance components in a linear model J. Am. Stat. Assoc. 67 112–115
Stroup W W (1989) Predictable functions and prediction space in the mixed model procedure Applications of Mixed Models in Agriculture and Related Disciplines Southern Cooperative Series Bulletin No. 343 39–48
Wolfinger R, Tobias R and Sall J (1994) Computing Gaussian likelihoods and their derivatives for general linear mixed models SIAM Sci. Statist. Comput. 15 1294–1310

5 Arguments

Note: prior to calling g02jhf the initialization routine g02jff must be called, therefore, this documentation should be read in conjunction with the document for g02jff. In particular some argument names and conventions described in that document are also relevant here, but their definition has not been repeated. Specifically, hlmm should be interpreted identically in both routines.
1: hlmm Type (c_ptr) Input
On entry: a G22 handle to the internal data structure containing a description of the required model as returned in hlmm by g02jff.
2: nvpr Integer Input
On entry: g, the number of variance components being estimated (excluding the overall variance, σR2).
This should be set to the value of nvpr returned by g02jff.
3: gamma(nvpr+1) Real (Kind=nag_wp) array Input/Output
On entry: holds the initial values of the variance components, γ0 , with gamma(i) the initial value for σi2/σR2, for i=1,2,,nvpr.
If gamma(1)=-1.0, the remaining elements of gamma are ignored and the initial values for the variance components are estimated from the data using MIVQUE0.
On exit: gamma(i), for i=1,2,,nvpr, holds the final estimate of σi2 and gamma(nvpr+1) holds the final estimate for σR2.
Labels for the variance components can be obtained using g22ydf.
Constraint: gamma(1)=-1.0 or gamma(i)0.0, for i=1,2,,g.
4: effn Integer Output
On exit: effective number of observations. If there are no weights (i.e., weight='U'), or all weights are nonzero, effn=n.
5: rnkx Integer Output
On exit: the rank of the design matrix, X, for the fixed effects.
6: ncov Integer Output
On exit: number of variance components not estimated to be zero. If none of the variance components are estimated to be zero, ncov=nvpr.
7: lnlike Real (Kind=nag_wp) Output
On exit: - 2 lR (γ^) where lR is the log of the restricted maximum likelihood calculated at γ^ , the estimated variance components returned in gamma.
8: lb Integer Input
On entry: the dimension of the arrays b and se as declared in the (sub)program from which g02jhf is called.
Constraint: lbfnlsv×nff+rnlsv×nrf.
9: b(lb) Real (Kind=nag_wp) array Output
On exit: the parameter estimates, with the first nrf×rnlsv elements of b containing the parameter estimates for the random effects, ν, and the remaining nff×fnlsv elements containing the parameter estimates for the fixed effects, β.
Labels for the parameter estimates can be obtained using g22ydf.
10: se(lb) Real (Kind=nag_wp) array Output
On exit: the standard errors of the parameter estimates given in b.
11: czz(ldczz,rnlsv×nrf) Real (Kind=nag_wp) array Output
On exit: if rnlsv=1, czz holds the lower triangular portion of the matrix (1/ σ 2 ) (ZTR^-1Z+G^-1) , where R^ and G^ are the estimates of R and G respectively.
If rnlsv>1, then czz holds this matrix in compressed form, with the first nrf columns holding the part of the matrix corresponding to the first level of the overall random subject variable, the next nrf columns holding the part of the matrix corresponding to the second level of the overall random subject variable etc.
If ldczz=0, czz is not referenced.
12: ldczz Integer Input
On entry: the first dimension of the array czz as declared in the (sub)program from which g02jhf is called.
Constraints:
  • ldczz=0 or ldczznrf;
  • if ldczz=0, ldcxx=ldcxz=0.
13: cxx(ldcxx,fnlsv×nff) Real (Kind=nag_wp) array Output
On exit: if fnlsv=1, cxx holds the lower triangular portion of the matrix (1/ σ 2 ) (XTV^-1X) , where V^ is the estimated value of V.
If fnlsv>1, then cxx holds this matrix in compressed form, with the first nff columns holding the part of the matrix corresponding to the first level of the overall fixed subject variable, the next nff columns holding the part of the matrix corresponding to the second level of the overall fixed subject variable, etc.
If ldcxx=0, cxx is not referenced.
14: ldcxx Integer Input
On entry: the first dimension of the array cxx as declared in the (sub)program from which g02jhf is called.
Constraint: ldcxx=0 or ldcxxnff.
15: cxz(ldcxz,fnlsv×nff) Real (Kind=nag_wp) array Output
On exit: cxz holds the matrix (1/σ2) (XTV^-1Z) G^ , where V^ and G^ are the estimates of V and G respectively.
If ldcxz=0, cxz is not referenced.
16: ldcxz Integer Input
On entry: the first dimension of the array cxz as declared in the (sub)program from which g02jhf is called.
Constraint: ldcxz=0 or ldcxzfnlsv×nff.
17: rcomm(*) Real (Kind=nag_wp) array Communication Array
On entry: communication array initialized by a call to g02jff.
18: icomm(*) Integer array Communication Array
On entry: communication array initialized by a call to g02jff.
19: ifail Integer Input/Output
On entry: ifail must be set to 0, −1 or 1 to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of 0 causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of −1 means that an error message is printed while a value of 1 means that it is not.
If halting is not appropriate, the value −1 or 1 is recommended. If message printing is undesirable, then the value 1 is recommended. Otherwise, the value 0 is recommended. When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit: ifail=0 unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry ifail=0 or −1, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
ifail=11
hlmm has not been initialized or is corrupt.
ifail=12
hlmm is not a G22 handle as generated by g02jff.
ifail=21
On entry, nvpr=value.
Constraint: nvprvalue.
ifail=31
On entry, gamma(value)=value.
Constraint: gamma(1)=-1.0 or gamma(i)0.0.
ifail=81
On entry, lb=value.
Constraint: lbvalue.
ifail=121
On entry, ldczz=value.
Constraint: ldczzvalue.
ifail=141
On entry, ldcxx=value.
Constraint: ldcxxvalue.
ifail=161
On entry, ldcxz=value.
Constraint: ldcxzvalue.
ifail=171
On entry, the communication arrays, icomm and rcomm, have not been initialized correctly.
ifail=1001
Optimal solution found, but requested accuracy not achieved.
ifail=1002
Too many major iterations.
ifail=1003
Current point cannot be improved upon.
ifail=1004
At least one negative estimate for gamma was obtained. All negative estimates have been set to zero.
ifail=-99
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
ifail=-399
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
ifail=-999
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.

7 Accuracy

Not applicable.

8 Parallelism and Performance

Background information to multithreading can be found in the Multithreading documentation.
g02jhf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g02jhf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

If g02jgf has been called, then rather than use the values of fnlsv, nff, rnlsv, nrf and nvpr as returned by g02jff they should be obtained by calling g02zlf. See Section 9 in g02jgf for more details.

10 Example

This example fits a random effects model with three random submodels and two fixed effects to a simulated dataset with 90 observations and 12 variables. The model is fit using restricted maximum likelihood (REML). Custom labels for the parameter estimates and variance components are then constructed from information returned by g22ydf. See g02jff for an example that uses the standard parameter labels directly.

10.1 Program Text

Program Text (g02jhfe.f90)

10.2 Program Data

Program Data (g02jhfe.d)

10.3 Program Results

Program Results (g02jhfe.r)