naginterfaces.library.correg.lmm_init
- naginterfaces.library.correg.lmm_init(hlmm, hddesc, hfixed, y, dat, hrndm=None, wt=None)
lmm_init preprocesses a dataset prior to fitting a linear mixed effects regression model via lmm_fit().
Note: this function uses optional algorithmic parameters, see also: blgm.optset, blgm.optget, lmm_fit().
For full information please refer to the NAG Library document for g02jf
https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g02/g02jff.html
- Parameters
- hlmm : Handle, modified in place
On entry: hlmm must be set to a null Handle or, alternatively, an existing G22 handle may be supplied, in which case lmm_init will destroy the supplied G22 handle as if blgm.handle_free had been called.
On exit: holds a G22 handle to the internal data structure containing a description of the model. You must not change the G22 handle other than through the functions in submodule correg or submodule blgm.
- hddesc : Handle
A G22 handle to the internal data structure containing a description of the data matrix, $D$, as returned by blgm.lm_describe_data.
- hfixed : Handle
A G22 handle to the internal data structure containing a description of the fixed part of the model, as returned by blgm.lm_formula.
If hfixed is a null Handle then the model is assumed to not have a fixed part.
- y : float, array-like, shape (n)
$y$, the vector of $n$ observations on the dependent variable.
- dat : float, array-like, two-dimensional
The data matrix, $D$. By default, $D_{ij}$, the $i$th value for the $j$th variable, for $i = 1,2,\ldots,n$, for $j = 1,2,\ldots,m$, should be supplied in dat[i-1, j-1].
If the option ‘Storage Order’, described in blgm.lm_describe_data, is set to ‘VAROBS’, $D_{ij}$ should be supplied in dat[j-1, i-1].
If either $y_i$, wt[i-1] or $D_{ij}$, for a variable $j$ used in the model, is NaN (Not A Number) then that value is treated as missing and the whole observation is excluded from the analysis.
- hrndm : None or Handle, list, optional
A series of G22 handles to internal data structures containing a description of the random part of the model, as returned by blgm.lm_formula.
- wt : None or float, array-like, shape (n), optional
Optionally, the diagonal elements of the weight matrix $W_c$.
If wt[i-1] = 0, the $i$th observation is not included in the model and the effective number of observations is the number of observations with nonzero weights.
If weights are not provided then wt must be set to None, and the effective number of observations is $n$.
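The storage-order, missing-value and zero-weight rules described for dat and wt can be sketched in plain Python. This is only an illustration of the rules as stated above, not NAG code: the helper names `to_varobs` and `included_observations`, the toy data, and the choice of "used" columns are all hypothetical.

```python
import math

def to_varobs(dat_obsvar):
    """Transpose an observations-by-variables table into variables-by-observations."""
    return [list(col) for col in zip(*dat_obsvar)]

def included_observations(y, dat_obsvar, used_columns, wt=None):
    """Indices of observations kept under the rules described above.

    An observation is dropped if y, its weight, or any *used* data value is NaN,
    or if its weight is exactly zero.
    """
    keep = []
    for i, (yi, row) in enumerate(zip(y, dat_obsvar)):
        wi = 1.0 if wt is None else wt[i]
        values = [yi, wi] + [row[j] for j in used_columns]
        if any(math.isnan(v) for v in values):
            continue  # missing value: the whole observation is excluded
        if wi == 0.0:
            continue  # zero weight: observation not included in the model
        keep.append(i)
    return keep

nan = float("nan")
y = [1.2, 0.7, nan, 2.1, 1.5]
dat = [[1.0, 10.0], [2.0, nan], [3.0, 30.0], [1.0, 40.0], [2.0, 50.0]]
wt = [1.0, 1.0, 1.0, 0.0, 2.0]

print(to_varobs(dat)[0])                               # first variable, all observations
print(included_observations(y, dat, [0]))              # only column 0 is in the model
print(included_observations(y, dat, [0, 1]))           # both columns are in the model
print(included_observations(y, dat, [0, 1], wt=wt))    # with case weights
```

The length of the returned list plays the role of the effective number of observations; a NaN in a column that the model does not use leaves the observation in place.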
- Returns
- fnlsv : int
The number of levels for the overall subject variable in $M_f$. If there is no overall subject variable, fnlsv = 1.
- nff : int
The number of fixed effects estimated in each of the fnlsv subject blocks. The number of columns, $p$, in the design matrix $X$ is given by $p = \mathrm{nff} \times \mathrm{fnlsv}$.
- rnlsv : int
The number of levels for the overall subject variable in $M_r$. If there is no overall subject variable, rnlsv = 1.
- nrf : int
The number of random effects estimated in each of the rnlsv subject blocks. The number of columns, $q$, in the design matrix $Z$ is given by $q = \mathrm{nrf} \times \mathrm{rnlsv}$.
- nvpr : int
$g$, the number of variance components being estimated (excluding the overall variance, $\sigma_R^2$). This is defined by the number of terms in the random part of the model, $M_r$ (see Algorithmic Details for details).
- comm : dict, communication object
Communication structure.
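As a small worked example of how the returned counts relate to the design matrix sizes, the sketch below assumes that the number of columns is the number of effects per subject block multiplied by the number of subject levels. The function name `design_columns` and the numeric values are hypothetical; they are not output from lmm_init.

```python
def design_columns(effects_per_block, subject_levels):
    """Total design-matrix columns, assuming one block of effects per subject level."""
    if effects_per_block < 0 or subject_levels < 1:
        raise ValueError("need effects_per_block >= 0 and subject_levels >= 1")
    return effects_per_block * subject_levels

# Hypothetical values standing in for (fnlsv, nff, rnlsv, nrf):
# no overall subject variable in the fixed part, four subjects in the random part.
fnlsv, nff, rnlsv, nrf = 1, 3, 4, 2

p = design_columns(nff, fnlsv)   # columns of the fixed design matrix
q = design_columns(nrf, rnlsv)   # columns of the random design matrix
print(p, q)
```

Such counts are mainly useful for sizing any arrays that a later fitting step expects.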
- Other Parameters
- ‘Gamma Lower Bound’ : float
Default
A lower bound for the elements of , where .
- ‘Gamma Upper Bound’ : float
Default
An upper bound for the elements of , where .
- ‘Initial Distance’ : float
Default
The initial distance from the solution.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, lmm_fit() passes ‘Initial Distance’ to the solver as stepmx.
When ‘Solver’ selects opt.nlp1_solve, this option is ignored.
- ‘Initial Value Strategy’ : int
Default
Controls how lmm_fit() will choose the initial values for the variance components, $\gamma$, if not supplied.
The MIVQUE0 estimates of the variance components based on the likelihood specified by ‘Likelihood’ are used.
The MIVQUE0 estimates based on the maximum likelihood are used, irrespective of the value of ‘Likelihood’.
See Rao (1972) for a description of the minimum variance quadratic unbiased estimators (MIVQUE0).
By default, for small problems, and for large problems .
- ‘Likelihood’ : str
Default
‘Likelihood’ defines whether lmm_fit() will use the restricted maximum likelihood (REML) or the maximum likelihood (ML) when fitting the model.
- ‘Linear Minimization Accuracy’ : float
Default
The accuracy of the linear minimizations.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, lmm_fit() passes ‘Linear Minimization Accuracy’ to the solver as eta.
When ‘Solver’ selects opt.nlp1_solve, this option is ignored.
- ‘Line Search Tolerance’ : float
Default
The line search tolerance.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, this option is ignored.
When ‘Solver’ selects opt.nlp1_solve, lmm_fit() passes ‘Line Search Tolerance’ to the solver as ‘Line Search Tolerance’.
- ‘List’ : valueless
Option ‘List’ enables printing of each option specification as it is supplied. ‘NoList’ suppresses this printing.
- ‘NoList’ : valueless
Default
Option ‘List’ enables printing of each option specification as it is supplied. ‘NoList’ suppresses this printing.
- ‘Major Iteration Limit’ : int
Default
The number of major iterations.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, lmm_fit() passes ‘Major Iteration Limit’ to the solver as maxcal. In this case, a solver-specific default value is used.
When ‘Solver’ selects opt.nlp1_solve, lmm_fit() passes ‘Major Iteration Limit’ to the solver as ‘Major Iteration Limit’. In this case, the default value depends on $g$, the number of variance components being estimated (excluding the overall variance, $\sigma_R^2$).
- ‘Major Print Level’ : int
Default
The frequency with which monitoring information is output to ‘Unit Number’.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, lmm_fit() passes ‘Major Print Level’ to the solver as iprint. In this case, the default value is such that no monitoring information will be output.
When ‘Solver’ selects opt.nlp1_solve, lmm_fit() passes ‘Major Print Level’ to the solver as ‘Major Print Level’. In this case, the default value is such that no monitoring information will be output.
- ‘Maximum Number of Threads’ : int
Default
Controls the maximum number of threads used by lmm_fit() in a multithreaded library. By default, the maximum number of available threads is used.
In a library that is not multithreaded, this option has no effect.
- ‘Minor Iteration Limit’ : int
Default
The number of minor iterations.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, this option is ignored.
When ‘Solver’ selects opt.nlp1_solve, lmm_fit() passes ‘Minor Iteration Limit’ to the solver as ‘Minor Iteration Limit’. In this case, the default value depends on $g$, the number of variance components being estimated (excluding the overall variance, $\sigma_R^2$).
- ‘Minor Print Level’ : int
Default
The frequency with which additional monitoring information is output to ‘Unit Number’.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, this option is ignored.
When ‘Solver’ selects opt.nlp1_solve, lmm_fit() passes ‘Minor Print Level’ to the solver as ‘Minor Print Level’. By default, no additional monitoring information will be output.
- ‘Optimality Tolerance’ : float
Default
The optimality tolerance.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, this option is ignored.
When ‘Solver’ selects opt.nlp1_solve, lmm_fit() passes ‘Optimality Tolerance’ to the solver as ‘Optimality Tolerance’.
- ‘Parallelisation Strategy’ : int
Default
If then ‘Parallelisation Strategy’ controls how lmm_fit() is parallelised in a multithreaded library.
lmm_fit() will attempt to parallelise operations involving , even if .
lmm_fit() will only attempt to parallelise operations involving , if .
By default, , however, for some models / datasets, this may be slower than using when .
In a library that is not multithreaded, this option has no effect.
- ‘Solution Accuracy’ : float
Default
The accuracy to which the solution is required.
When ‘Solver’ selects opt.bounds_mod_deriv2_comp, lmm_fit() passes ‘Solution Accuracy’ to the solver as xtol.
When ‘Solver’ selects opt.nlp1_solve, this option is ignored.
- ‘Solver’ : str
Default
Controls which solver lmm_fit() will use when fitting the model. By default, ‘Solver’ = ‘E04LB’ is used for small problems and ‘Solver’ = ‘E04UC’, otherwise.
If ‘Solver’ = ‘E04LB’, then the solver used is the one implemented in opt.bounds_mod_deriv2_comp and if ‘Solver’ = ‘E04UC’, then the solver used is the one implemented in opt.nlp1_solve.
- ‘Sweep Tolerance’ : float
Default
The sweep tolerance used by lmm_fit() when performing the sweep operation (see Wolfinger et al. (1994)). The default value used is , where .
- ‘Unit Number’ : int
Default
The monitoring unit number to which lmm_fit() will send any monitoring information.
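To give a feel for what a sweep tolerance guards against, the sketch below implements a generic textbook sweep operator on a small dense matrix and skips any pivot whose magnitude falls below a tolerance. It is not the routine used by lmm_fit(), and how the library scales its tolerance is not shown here; the function name `sweep`, its return convention, and the fixed `tol` value are assumptions made for the illustration.

```python
def sweep(a, k, tol=1e-12):
    """Sweep the square matrix `a` (list of lists) on pivot k.

    Returns (new_matrix, swept). If |a[k][k]| < tol the pivot is treated as
    zero, the matrix is returned unchanged and swept is False.
    """
    d = a[k][k]
    if abs(d) < tol:
        return [row[:] for row in a], False
    n = len(a)
    b = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                b[i][j] = 1.0 / d
            elif i == k:
                b[i][j] = a[k][j] / d            # pivot row
            elif j == k:
                b[i][j] = -a[i][k] / d           # pivot column
            else:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return b, True

# Sweeping every pivot of a nonsingular matrix yields its inverse.
m = [[2.0, 1.0], [1.0, 3.0]]
for k in range(len(m)):
    m, swept = sweep(m, k)
print(m)

# A pivot below the tolerance is skipped rather than divided by.
_, swept_tiny = sweep([[1e-15]], 0)
print(swept_tiny)
```

In regression-style uses of the sweep, a skipped pivot usually signals a column that is (numerically) a linear combination of columns already swept in.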
- Raises
- NagValueError
- (errno )
On entry, hlmm is not a null Handle or a recognised G22 handle.
- (errno )
hddesc has not been initialized or is corrupt.
- (errno )
hddesc is not a G22 handle as generated by blgm.lm_describe_data.
- (errno )
hfixed has not been initialized or is corrupt.
- (errno )
hfixed is not a G22 handle as generated by blgm.lm_formula.
- (errno )
A variable name used when creating hfixed is not present in hddesc.
Variable name: .
- (errno )
On entry, .
Constraint: .
- (errno )
An element of hrndm has not been initialized or is corrupt.
- (errno )
An element of hrndm is not a G22 handle as generated by blgm.lm_formula.
- (errno )
No model has been specified.
- (errno )
A variable name used when creating is not present in .
Variable name: .
- (errno )
On entry, and .
Constraint: or .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, no observations due to zero weights or missing values.
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, column of the data matrix, , is not consistent with information supplied in , .
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, and .
Constraint: .
- Warns
- NagAlgorithmicWarning
- (errno )
The fixed part of the model contains categorical variables, but no intercept or main effects terms have been requested.
- (errno )
Column of the data matrix, , required rounding more than expected when being treated as a categorical variable, .
- Notes
lmm_init must be called prior to fitting a linear mixed effects regression model via lmm_fit().
The model is of the form:
$$y = X\beta + Z\nu + \epsilon$$
where
$y$ is a vector of $n$ observations on the dependent variable,
$X$ is an $n \times p$ design matrix of fixed independent variables,
$\beta$ is a vector of $p$ unknown fixed effects,
$Z$ is an $n \times q$ design matrix of random independent variables,
$\nu$ is a vector of length $q$ of unknown random effects,
$\epsilon$ is a vector of length $n$ of unknown random errors.
Both $\nu$ and $\epsilon$ are assumed to have a Gaussian distribution with expectation zero and variance/covariance matrix defined by
$$\mathrm{Var}\begin{bmatrix} \nu \\ \epsilon \end{bmatrix} = \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix}$$
where $R = \sigma_R^2 I$, $I$ is the $n \times n$ identity matrix and $G$ is a diagonal matrix. It is assumed that the random variables, $Z$, can be subdivided into $g \le q$ groups with each group being identically distributed with expectation zero and variance $\sigma_i^2$. The diagonal elements of matrix $G$, therefore, take one of the values $\{\sigma_i^2 : i = 1,2,\ldots,g\}$, depending on which group the associated random variable belongs to.
The model, therefore, contains three sets of unknowns: the fixed effects $\beta$, the random effects $\nu$ and a vector of $g + 1$ variance components $\gamma$, where $\gamma = \{\sigma_1^2, \sigma_2^2, \ldots, \sigma_{g-1}^2, \sigma_g^2, \sigma_R^2\}$.
Case weights can be incorporated into the model by replacing $X^T X$ and $Z^T Z$ with $X^T W_c X$ and $Z^T W_c Z$ respectively, where $W_c$ is a diagonal weight matrix.
The design matrices, $X$ and $Z$, are constructed from an $n \times m$ data matrix, $D$, a description of the fixed independent variables, $M_f$, and a description of the random independent variables, $M_r$. See Algorithmic Details for further details.
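The structure of the model can be made concrete with a short simulation. The sketch below assumes the usual linear mixed model form (observations equal to fixed part plus random part plus error, with a diagonal covariance for the random effects and a common error variance) and uses only the Python standard library. The function `simulate_lmm`, its arguments, and the toy design matrices are hypothetical and unrelated to the NAG interface.

```python
import random

def simulate_lmm(X, Z, beta, group_of, sigma2, sigma2_r, seed=0):
    """Draw y = X*beta + Z*nu + eps with diagonal G and R = sigma2_r * I.

    group_of[k] is the variance group of random effect k; sigma2[g] is the
    variance shared by all random effects in group g.
    """
    rng = random.Random(seed)
    # Diagonal of G: each random effect takes the variance of its group.
    g_diag = [sigma2[g] for g in group_of]
    nu = [rng.gauss(0.0, v ** 0.5) for v in g_diag]
    y = []
    for x_row, z_row in zip(X, Z):
        fixed = sum(x * b for x, b in zip(x_row, beta))
        rand = sum(z * u for z, u in zip(z_row, nu))
        eps = rng.gauss(0.0, sigma2_r ** 0.5)
        y.append(fixed + rand + eps)
    return y, nu, g_diag

# Toy problem: 6 observations, 2 fixed effects (intercept and slope),
# 3 random intercepts (one per subject) that all share one variance group.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
Z = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
beta = [2.0, 0.5]
group_of = [0, 0, 0]

y, nu, g_diag = simulate_lmm(X, Z, beta, group_of, sigma2=[0.5], sigma2_r=0.1)
print(len(y), len(nu), g_diag)
```

Here there is one group variance plus the error variance, so two variance components in total; setting both to zero reduces the simulated response to the fixed part alone.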
- References
Rao, C R, 1972, Estimation of variance and covariance components in a linear model, J. Am. Stat. Assoc. (67), 112–115
Wolfinger, R, Tobias, R and Sall, J, 1994, Computing Gaussian likelihoods and their derivatives for general linear mixed models, SIAM Sci. Statist. Comput. (15), 1294–1310