naginterfaces.library.correg.lmm_init

naginterfaces.library.correg.lmm_init(hlmm, hddesc, hfixed, y, dat, hrndm=None, wt=None)

lmm_init preprocesses a dataset prior to fitting a linear mixed effects regression model via lmm_fit().

Note: this function uses optional algorithmic parameters, see also: blgm.optset, blgm.optget, lmm_fit().

For full information please refer to the NAG Library document for g02jf

https://support.nag.com/numeric/nl/nagdoc_30.3/flhtml/g02/g02jff.html

Parameters
hlmmHandle, modified in place

On entry: must be set to a null Handle or, alternatively, an existing G22 handle may be supplied in which case lmm_init will destroy the supplied G22 handle as if blgm.handle_free had been called.

On exit: holds a G22 handle to the internal data structure containing a description of the model. You must not change the G22 handle other than through the functions in submodule correg or submodule blgm.

hddescHandle

A G22 handle to the internal data structure containing a description of the data matrix, $D$, as returned by blgm.lm_describe_data.

hfixedHandle

A G22 handle to the internal data structure containing a description of the fixed part of the model, as returned by blgm.lm_formula.

If hfixed is a null Handle then the model is assumed not to have a fixed part.

yfloat, array-like, shape $(n)$

$y$, the vector of $n$ observations on the dependent variable.

datfloat, array-like, shape

The data matrix, $D$. By default, $D_{ij}$, the $i$th value for the $j$th variable, for $i = 1, 2, \ldots, n$, for $j = 1, 2, \ldots, m$, should be supplied in dat[i-1, j-1].

If the option ‘Storage Order’, described in blgm.lm_describe_data, is set to ‘VAROBS’, $D_{ij}$ should be supplied in dat[j-1, i-1].

If either $y_i$, $wt_i$ or $D_{ij}$, for a variable $j$ used in the model, is NaN (Not A Number) then that value is treated as missing and the whole observation is excluded from the analysis.
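The missing-value rule above amounts to listwise deletion. The following is a minimal sketch in plain Python, not part of the NAG interface: `complete_rows` is a hypothetical helper, and the default layout (observations in rows, variables in columns) is assumed.

```python
import math

def complete_rows(y, dat, used_cols, wt=None):
    """Return indices of observations with no NaN in y, wt or the used columns.

    Assumes the default storage order: dat[i][j] holds observation i of variable j.
    """
    keep = []
    for i, row in enumerate(dat):
        values = [y[i]] + [row[j] for j in used_cols]
        if wt is not None:
            values.append(wt[i])
        if not any(math.isnan(v) for v in values):
            keep.append(i)
    return keep

nan = float("nan")
y = [1.2, 0.7, nan, 2.4]
dat = [
    [1.0, 10.0, nan],   # NaN only in column 2, which the model does not use
    [2.0, nan, 5.0],    # NaN in a used column: observation dropped
    [1.0, 12.0, 6.0],   # y is NaN: observation dropped
    [2.0, 13.0, 7.0],
]
kept = complete_rows(y, dat, used_cols=[0, 1])
print(kept)
```

A NaN in a variable that the model does not reference leaves the observation in the analysis; only variables named in the fixed or random part matter.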

hrndmNone or Handle, list, shape , optional

A series of G22 handles to internal data structures containing a description of the random part of the model, as returned by blgm.lm_formula.

wtNone or float, array-like, shape $(n)$, optional

Optionally, the diagonal elements of the weight matrix $W_c$.

If wt[i-1] = 0.0, the $i$th observation is not included in the model and the effective number of observations is the number of observations with nonzero weights.

If weights are not provided then wt must be set to None, and the effective number of observations is $n$.
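As an illustration of the weighting rule, the effective number of observations can be counted as follows. This is a sketch only; `effective_nobs` is a hypothetical helper and it ignores the separate exclusion of observations with missing values.

```python
def effective_nobs(n, wt=None):
    """Effective number of observations for n rows and optional weights wt."""
    if wt is None:
        # No weights supplied: every observation counts.
        return n
    # Zero-weight observations are left out of the model.
    return sum(1 for w in wt if w != 0.0)

print(effective_nobs(5))
print(effective_nobs(5, [1.0, 0.0, 2.5, 0.0, 1.0]))
```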

Returns
fnlsvint

The number of levels for the overall subject variable in $M_f$. If there is no overall subject variable, fnlsv = 1.

nffint

The number of fixed effects estimated in each of the fnlsv subject blocks. The number of columns, $p$, in the design matrix $X$ is given by $p = \mathrm{nff} \times \mathrm{fnlsv}$.

rnlsvint

The number of levels for the overall subject variable in $M_r$. If there is no overall subject variable, rnlsv = 1.

nrfint

The number of random effects estimated in each of the rnlsv subject blocks. The number of columns, $q$, in the design matrix $Z$ is given by $q = \mathrm{nrf} \times \mathrm{rnlsv}$.

nvprint

$g$, the number of variance components being estimated (excluding the overall variance, $\sigma_R^2$). This is defined by the number of terms in the random part of the model, $M_r$ (see Algorithmic Details for details).

commdict, communication object

Communication structure.
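A hedged sketch of how the returned values might be consumed. It assumes that a null Handle is created with `naginterfaces.base.utils.Handle()`, that the result exposes its fields by the names listed above, and that the design matrix widths are the products of the per-block counts and the number of subject levels. `design_dims` and `init_mixed_model` are hypothetical helper names, and the data and formula handles are assumed to have been built already with blgm.lm_describe_data and blgm.lm_formula.

```python
def design_dims(fnlsv, nff, rnlsv, nrf):
    """Column counts (p, q) of the fixed and random design matrices."""
    return nff * fnlsv, nrf * rnlsv

def init_mixed_model(hddesc, hfixed, y, dat, hrndm=None, wt=None):
    """Call lmm_init on handles built elsewhere and report the design sizes."""
    # Imported lazily so that design_dims stays usable without the NAG library.
    from naginterfaces.base import utils      # assumed home of the Handle class
    from naginterfaces.library import correg

    hlmm = utils.Handle()  # assumed way to create a null Handle
    res = correg.lmm_init(hlmm, hddesc, hfixed, y, dat, hrndm=hrndm, wt=wt)
    p, q = design_dims(res.fnlsv, res.nff, res.rnlsv, res.nrf)
    # hlmm and res.comm are what a later call to lmm_fit() would need.
    return hlmm, res.comm, {"p": p, "q": q, "nvpr": res.nvpr}

# Three subject levels with two fixed effects each, four with five random effects each.
print(design_dims(3, 2, 4, 5))
```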

Other Parameters
‘Gamma Lower Bound’float

Default

A lower bound for the elements of , where .

‘Gamma Upper Bound’float

Default

An upper bound for the elements of , where .

‘Initial Distance’float

Default

The initial distance from the solution.

When ‘Solver’ = ‘E04LB’, lmm_fit() passes ‘Initial Distance’ to the solver as stepmx.

When ‘Solver’ = ‘E04UC’, this option is ignored.

‘Initial Value Strategy’int

Default

Controls how lmm_fit() will choose the initial values for the variance components, , if not supplied.

The MIVQUE0 estimates of the variance components based on the likelihood specified by ‘Likelihood’ are used.

The MIVQUE0 estimates based on the maximum likelihood are used, irrespective of the value of ‘Likelihood’.

See Rao (1972) for a description of the minimum variance quadratic unbiased estimators (MIVQUE0).

By default, for small problems, and for large problems .

‘Likelihood’str

Default

‘Likelihood’ defines whether lmm_fit() will use the restricted maximum likelihood (REML) or the maximum likelihood (ML) when fitting the model.

‘Linear Minimization Accuracy’float

Default

The accuracy of the linear minimizations.

When ‘Solver’ = ‘E04LB’, lmm_fit() passes ‘Linear Minimization Accuracy’ to the solver as eta.

When ‘Solver’ = ‘E04UC’, this option is ignored.

‘Line Search Tolerance’float

Default

The line search tolerance.

When ‘Solver’ = ‘E04LB’, this option is ignored.

When ‘Solver’ = ‘E04UC’, lmm_fit() passes ‘Line Search Tolerance’ to the solver as ‘Line Search Tolerance’.

‘List’valueless

Option ‘List’ enables printing of each option specification as it is supplied. ‘NoList’ suppresses this printing.

‘NoList’valueless

Default

Option ‘List’ enables printing of each option specification as it is supplied. ‘NoList’ suppresses this printing.

‘Major Iteration Limit’int

Default

The number of major iterations.

When , lmm_fit() passes ‘Major Iteration Limit’ to the solver as . In this case, the default value used is .

When , lmm_fit() passes ‘Major Iteration Limit’ to the solver as ‘Major Iteration Limit’. In this case, the default value used is , where is the number of variance components being estimated (excluding the overall variance, ).

‘Major Print Level’int

Default

The frequency with which monitoring information is output to ‘Unit Number’.

When , lmm_fit() passes ‘Major Print Level’ to the solver as . In this case, the default value used is and hence no monitoring information will be output.

When , lmm_fit() passes ‘Major Print Level’ to the solver as ‘Major Print Level’. In this case, the default value used is and hence no monitoring information will be output.

‘Maximum Number of Threads’int

Default

Controls the maximum number of threads used by lmm_fit() in a multithreaded library. By default, the maximum number of available threads is used.

In a library that is not multithreaded, this option has no effect.

‘Minor Iteration Limit’int

Default

The number of minor iterations.

When ‘Solver’ = ‘E04LB’, this option is ignored.

When , lmm_fit() passes ‘Minor Iteration Limit’ to the solver as ‘Minor Iteration Limit’. In this case, the default value used is , where is the number of variance components being estimated (excluding the overall variance, ).

‘Minor Print Level’int

Default

The frequency with which additional monitoring information is output to ‘Unit Number’.

When ‘Solver’ = ‘E04LB’, this option is ignored.

When , lmm_fit() passes ‘Minor Print Level’ to the solver as ‘Minor Print Level’. The default value of means that no additional monitoring information will be output.

‘Optimality Tolerance’float

Default

The optimality tolerance.

When ‘Solver’ = ‘E04LB’, this option is ignored.

When ‘Solver’ = ‘E04UC’, lmm_fit() passes ‘Optimality Tolerance’ to the solver as ‘Optimality Tolerance’.

‘Parallelisation Strategy’int

Default

If then ‘Parallelisation Strategy’ controls how lmm_fit() is parallelised in a multithreaded library.

lmm_fit() will attempt to parallelise operations involving , even if .

lmm_fit() will only attempt to parallelise operations involving , if .

By default, , however, for some models / datasets, this may be slower than using when .

In a library that is not multithreaded, this option has no effect.

‘Solution Accuracy’float

Default

The accuracy to which the solution is required.

When ‘Solver’ = ‘E04LB’, lmm_fit() passes ‘Solution Accuracy’ to the solver as xtol.

When ‘Solver’ = ‘E04UC’, this option is ignored.

‘Solver’str

Default

Controls which solver lmm_fit() will use when fitting the model. By default, is used for small problems and , otherwise.

If ‘Solver’ = ‘E04LB’, then the solver used is the one implemented in opt.bounds_mod_deriv2_comp and if ‘Solver’ = ‘E04UC’, then the solver used is the one implemented in opt.nlp1_solve.

‘Sweep Tolerance’float

Default

The sweep tolerance used by lmm_fit() when performing the sweep operation Wolfinger et al. (1994). The default value used is , where .
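For readers unfamiliar with the sweep operation, the sketch below shows the textbook symmetric sweep operator in plain Python together with a pivot tolerance. It is not the routine used inside lmm_fit(): the function name `sweep`, the default tolerance and the rule of skipping a pivot whose magnitude does not exceed the tolerance are assumptions made for illustration.

```python
def sweep(a, k, tol=1e-12):
    """Sweep the square matrix `a` (list of lists) in place on pivot k.

    Returns False, leaving `a` untouched, when |a[k][k]| does not exceed tol.
    """
    d = a[k][k]
    if abs(d) <= tol:
        return False
    n = len(a)
    a[k] = [v / d for v in a[k]]
    for i in range(n):
        if i == k:
            continue
        b = a[i][k]
        a[i] = [a[i][j] - b * a[k][j] for j in range(n)]
        a[i][k] = -b / d
    a[k][k] = 1.0 / d
    return True

# Augmented cross-product matrix [[X'X, X'y], [y'X, y'y]] for
# X = [[1, 0], [1, 1], [1, 2]] and y = [1, 2, 4].
a = [[3.0, 3.0, 7.0],
     [3.0, 5.0, 10.0],
     [7.0, 10.0, 21.0]]
for k in range(2):
    sweep(a, k)
beta = [a[0][2], a[1][2]]   # least-squares coefficients
rss = a[2][2]               # residual sum of squares
print(beta, rss)
```

Sweeping on the first two pivots leaves the inverse of X'X in the top-left block, the least-squares coefficients in the last column and the residual sum of squares in the bottom-right entry. A pivot tolerance guards against dividing by a near-zero pivot, which arises when a column is close to a linear combination of columns already swept.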

‘Unit Number’int

Default

The monitoring unit number to which lmm_fit() will send any monitoring information.

Raises
NagValueError
(errno )

On entry, hlmm is not a null Handle or a recognised G22 handle.

(errno )

hddesc has not been initialized or is corrupt.

(errno )

hddesc is not a G22 handle as generated by blgm.lm_describe_data.

(errno )

hfixed has not been initialized or is corrupt.

(errno )

hfixed is not a G22 handle as generated by blgm.lm_formula.

(errno )

A variable name used when creating hfixed is not present in hddesc.

Variable name: ⟨value⟩.

(errno )

On entry, .

Constraint: .

(errno )

.

has not been initialized or is corrupt.

(errno )

.

is not a G22 handle as generated by blgm.lm_formula.

(errno )

No model has been specified.

(errno )

A variable name used when creating hrndm is not present in hddesc.

Variable name: ⟨value⟩.

(errno )

On entry, and .

Constraint: or .

(errno )

On entry, .

Constraint: .

(errno )

On entry, and .

Constraint: .

(errno )

On entry, no observations due to zero weights or missing values.

(errno )

On entry, and .

Constraint: .

(errno )

On entry, column $j$ of the data matrix, $D$, is not consistent with information supplied in hddesc, $j = $ ⟨value⟩.

(errno )

On entry, and .

Constraint: .

(errno )

On entry, and .

Constraint: .

(errno )

On entry, and .

Constraint: .

(errno )

On entry, and .

Constraint: .

Warns
NagAlgorithmicWarning
(errno )

The fixed part of the model contains categorical variables, but no intercept or main effects terms have been requested.

(errno )

Column $j$ of the data matrix, $D$, required rounding more than expected when being treated as a categorical variable, $j = $ ⟨value⟩.

Notes

lmm_init must be called prior to fitting a linear mixed effects regression model via lmm_fit().

The model is of the form:

$$y = X\beta + Z\nu + \epsilon$$

where

$y$ is a vector of $n$ observations on the dependent variable,

$X$ is an $n \times p$ design matrix of fixed independent variables,

$\beta$ is a vector of $p$ unknown fixed effects,

$Z$ is an $n \times q$ design matrix of random independent variables,

$\nu$ is a vector of length $q$ of unknown random effects,

$\epsilon$ is a vector of length $n$ of unknown random errors.

Both $\nu$ and $\epsilon$ are assumed to have a Gaussian distribution with expectation zero and variance/covariance matrix defined by

$$\mathrm{Var}\begin{bmatrix} \nu \\ \epsilon \end{bmatrix} = \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix}$$

where $R = \sigma_R^2 I$, $I$ is the $n \times n$ identity matrix and $G$ is a diagonal matrix. It is assumed that the random variables, $Z$, can be subdivided into $g \le q$ groups with each group being identically distributed with expectation zero and variance $\sigma_i^2$. The diagonal elements of matrix $G$, therefore, take one of the values $\{\sigma_i^2 : i = 1, 2, \ldots, g\}$, depending on which group the associated random variable belongs to.

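The model, therefore, contains three sets of unknowns: the fixed effects $\beta$, the random effects $\nu$ and a vector of $g + 1$ variance components $\gamma$, where $\gamma = \{\sigma_1^2, \sigma_2^2, \ldots, \sigma_{g-1}^2, \sigma_g^2, \sigma_R^2\}$.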

Case weights can be incorporated into the model by replacing $X$ and $Z$ with $W_c^{1/2} X$ and $W_c^{1/2} Z$ respectively, where $W_c$ is a diagonal weight matrix.

The design matrices, $X$ and $Z$, are constructed from an $n \times m$ data matrix, $D$, a description of the fixed independent variables, $M_f$, and a description of the random independent variables, $M_r$. See Algorithmic Details for further details.
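To make the structure of the model concrete, the sketch below simulates data from a linear mixed model with a fixed design matrix, an indicator-coded random design matrix and a diagonal random-effect covariance whose entries repeat one variance per group. It uses only the Python standard library; `simulate_lmm` and the toy design (an intercept plus standard-normal covariates, one level drawn per group for each observation) are assumptions for illustration and do not reflect how lmm_init builds its design matrices.

```python
import random

def simulate_lmm(n, beta, group_sizes, sigma2, sigma2_r, seed=0):
    """Draw y = X beta + Z nu + eps for a toy design.

    X has an intercept plus len(beta) - 1 standard-normal covariates.
    Z holds one indicator column per random effect; each observation is
    assigned to a single level within each of the g groups.
    """
    rng = random.Random(seed)
    p, q = len(beta), sum(group_sizes)
    # Diagonal of G: every random effect in group i has variance sigma2[i].
    g_diag = [s2 for size, s2 in zip(group_sizes, sigma2) for _ in range(size)]
    nu = [rng.gauss(0.0, s2 ** 0.5) for s2 in g_diag]
    X, Z, y = [], [], []
    for _ in range(n):
        x_row = [1.0] + [rng.gauss(0.0, 1.0) for _ in range(p - 1)]
        z_row, offset = [0.0] * q, 0
        for size in group_sizes:
            z_row[offset + rng.randrange(size)] = 1.0
            offset += size
        eps = rng.gauss(0.0, sigma2_r ** 0.5)  # error with variance sigma_R^2
        fixed = sum(b * x for b, x in zip(beta, x_row))
        rand = sum(v * z for v, z in zip(nu, z_row))
        X.append(x_row)
        Z.append(z_row)
        y.append(fixed + rand + eps)
    return X, Z, y, g_diag

X, Z, y, g_diag = simulate_lmm(n=8, beta=[1.0, 0.5], group_sizes=[3, 2],
                               sigma2=[4.0, 0.25], sigma2_r=1.0)
print(len(y), len(X[0]), len(Z[0]), g_diag)
```

Here there are two groups of random effects, so two variance components are estimated in addition to the overall error variance.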

References

Rao, C R, 1972, Estimation of variance and covariance components in a linear model, J. Am. Stat. Assoc. (67), 112–115

Wolfinger, R, Tobias, R and Sall, J, 1994, Computing Gaussian likelihoods and their derivatives for general linear mixed models, SIAM J. Sci. Comput. (15), 1294–1310