naginterfaces.library.correg.mixeff_hier_reml

naginterfaces.library.correg.mixeff_hier_reml(vpr, gamma, comm, iopt=None, ropt=None, io_manager=None)

mixeff_hier_reml fits a multi-level linear mixed effects regression model using restricted maximum likelihood (REML). Prior to calling mixeff_hier_reml the initialization function mixeff_hier_init() must be called.

Deprecated since version 27.0.0.0: mixeff_hier_reml is deprecated. Please use lmm_fit() instead. See also the Replacement Calls document.

For full information please refer to the NAG Library document for g02jd:

https://support.nag.com/numeric/nl/nagdoc_30.3/flhtml/g02/g02jdf.html

Parameters

vpr : int, array-like, shape $(lvpr)$

A vector of flags indicating the mapping between the random variables specified in $rndm$ and the variance components, $σ_{i}^{2}$ . See Further Comments for more details.

gamma : float, array-like, shape $(nvpr + 1)$

Holds the initial values of the variance components, $γ_{0}$, with $gamma[i - 1]$ the initial value for $σ_{i}^{2} / σ_{R}^{2}$, for $i = 1, 2, \dots, nvpr$.

If $gamma[0] = -1.0$, the remaining elements of $gamma$ are ignored and the initial values for the variance components are estimated from the data using MIVQUE0.
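
As a minimal sketch of the two ways of preparing $gamma$ described above, the following NumPy snippet builds one array that requests MIVQUE0 starting values and one that supplies starting ratios directly. The value of `nvpr` and the ratios are illustrative assumptions. The role of the final element on input is not described above, so it is simply left at zero here.

```python
import numpy as np

nvpr = 3  # hypothetical number of variance components

# Option 1: gamma[0] = -1.0 requests MIVQUE0 starting values;
# the remaining elements are then ignored.
gamma_auto = np.full(nvpr + 1, -1.0)

# Option 2: supply starting ratios sigma_i^2 / sigma_R^2 for i = 1..nvpr.
gamma_user = np.zeros(nvpr + 1)
gamma_user[:nvpr] = [0.5, 1.0, 2.0]  # illustrative values, each >= 0.0
```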

comm : dict, communication object, modified in place

Communication structure.

This argument must have been initialized by a prior call to mixeff_hier_init().

iopt : None or int, array-like, shape $(liopt)$, optional

Options passed to the optimization function.

By default mixeff_hier_reml fits the specified model using a modified Newton optimization algorithm as implemented in opt.bounds_mod_deriv2_comp. In cases where calculating the derivatives is computationally expensive, it may be more efficient to use a sequential QP algorithm.

The sequential QP algorithm as implemented in opt.nlp1_solve can be chosen by setting $iopt[4] = 1$.

If $liopt < 5$ or $iopt[4] \neq 1$, then the modified Newton algorithm will be used.

Different options are available depending on the optimization function used.

In all cases, using a value of $- 1$ will cause the default value to be used.

In addition, only the first $liopt$ values of $iopt$ are used. For example, if only the first element of $iopt$ needs changing and the default values of all other options are sufficient, $liopt$ can be set to $1$.

The following table lists the association between elements of $iopt$ and arguments in the optimizer when the modified Newton algorithm is being used.

$i$	Description	Equivalent argument	Default Value
$0$	Number of iterations	$maxcal$	$1000$
$1$	Unit number for monitoring information	n/a	The advisory unit number
$2$	Print options ( $1 =$ print)	n/a	$- 1$ (no printing performed)
$3$	Frequency that monitoring information is printed	$iprint$	$- 1$
$4$	Optimizer used	n/a	n/a

If requested, monitoring information is displayed in a similar format to that given by the modified Newton optimizer.

The following table lists the association between elements of $iopt$ and options in the optimizer when the sequential QP algorithm is being used.

$i$	Description	Equivalent option	Default Value
$0$	Number of iterations	‘Major Iteration Limit’	$\max(50, 3 \times nvpr)$
$1$	Unit number for monitoring information	n/a	The advisory unit number
$2$	Print options ( $1 =$ print, otherwise no print)	‘List’/‘Nolist’	$-1$ (no printing performed)
$3$	Frequency that monitoring information is printed	‘Major Print Level’	$0$
$4$	Optimizer used	n/a	n/a
$5$	Number of minor iterations	‘Minor Iteration Limit’	$\max(50, 3 \times nvpr)$
$6$	Frequency that additional monitoring information is printed	‘Minor Print Level’	$0$

If $liopt \leq 0$, default values are used for all options and $iopt$ is not referenced.
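
The option layout above can be sketched as a small helper that fills an $iopt$ vector with $-1$ (meaning "use the default") and overrides only the entries of interest. The helper name `make_iopt` and its arguments are hypothetical and not part of the library. Only the index meanings are taken from the tables above.

```python
import numpy as np

def make_iopt(use_sqp=False, max_iter=-1):
    """Build an iopt vector; -1 asks for the documented default."""
    # The sequential QP table documents elements 0..6, the Newton table 0..4.
    length = 7 if use_sqp else 5
    iopt = np.full(length, -1, dtype=int)
    iopt[0] = max_iter  # number of (major) iterations
    if use_sqp:
        iopt[4] = 1  # any value other than 1 selects modified Newton
    return iopt
```

For example, `make_iopt(use_sqp=True, max_iter=200)` selects the sequential QP algorithm with an iteration limit of 200 and leaves every other option at its default.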

ropt : None or float, array-like, shape $(lropt)$, optional

Options passed to the optimization function.

Different options are available depending on the optimization function used.

In all cases, using a value of $- 1.0$ will cause the default value to be used.

In addition, only the first $lropt$ values of $ropt$ are used. For example, if only the first element of $ropt$ needs changing and the default values of all other options are sufficient, $lropt$ can be set to $1$.

The following table lists the association between elements of $ropt$ and arguments in the optimizer when the modified Newton algorithm is being used.

$i$	Description	Equivalent argument	Default Value
$0$	Sweep tolerance	n/a	$\max(\sqrt{eps}, \sqrt{eps} \times \max_{i}(zz_{ii}))$
$1$	Lower bound for $γ^{*}$	n/a	$eps / 100$
$2$	Upper bound for $γ^{*}$	n/a	$10^{20}$
$3$	Accuracy of linear minimizations	$eta$	$0.9$
$4$	Accuracy to which solution is required	$xtol$	$0.0$
$5$	Initial distance from solution	$stepmx$	$100000.0$

The following table lists the association between elements of $ropt$ and options in the optimizer when the sequential QP algorithm is being used.

$i$	Description	Equivalent option	Default Value
$0$	Sweep tolerance	n/a	$\max(\sqrt{eps}, \sqrt{eps} \times \max_{i}(zz_{ii}))$
$1$	Lower bound for $γ^{*}$	n/a	$eps / 100$
$2$	Upper bound for $γ^{*}$	n/a	$10^{20}$
$3$	Line search tolerance	‘Line Search Tolerance’	$0.9$
$4$	Optimality tolerance	‘Optimality Tolerance’	$eps^{0.72}$

where $eps$ is the machine precision returned by machine.precision and $zz_{ii}$ denotes the $i$th diagonal element of $Z^{T} Z$.

If $lropt \leq 0$, then default values are used for all options and $ropt$ may be set to None.
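
To make the default sweep tolerance formula concrete, the sketch below evaluates $\max(\sqrt{eps}, \sqrt{eps} \times \max_{i}(zz_{ii}))$ for a given random-effects design matrix and then builds a $ropt$ vector that overrides only the upper bound for $γ^{*}$. The function name `default_sweep_tolerance` is hypothetical. `np.finfo(float).eps` is used as a stand-in for the value returned by machine.precision, which may differ.

```python
import numpy as np

def default_sweep_tolerance(Z, eps=np.finfo(float).eps):
    """max(sqrt(eps), sqrt(eps) * max_i zz_ii), with zz_ii from Z^T Z."""
    Z = np.asarray(Z, dtype=float)
    zz_diag = np.einsum("ij,ij->j", Z, Z)  # diagonal of Z^T Z
    root = np.sqrt(eps)
    return max(root, root * zz_diag.max())

# ropt with defaults everywhere (-1.0) except the upper bound for gamma*.
ropt = np.full(6, -1.0)
ropt[2] = 1.0e10  # illustrative override of the documented default 10^20
```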

io_manager : FileObjManager, optional

Manager for I/O in this routine.

Returns

gamma : float, ndarray, shape $(nvpr + 1)$

$gamma[i - 1]$, for $i = 1, 2, \dots, nvpr$, holds the final estimate of $σ_{i}^{2}$ and $gamma[nvpr]$ holds the final estimate for $σ_{R}^{2}$.

effn : int

Effective number of observations. If no weights were supplied to mixeff_hier_init() or all supplied weights were nonzero, $effn = n$.

rnkx : int

The rank of the design matrix, $X$ , for the fixed effects.

ncov : int

Number of variance components not estimated to be zero. If none of the variance components are estimated to be zero, then $ncov = nvpr$.

lnlike : float

$-2 l_{R}(\hat{γ})$, where $l_{R}$ is the log of the restricted maximum likelihood calculated at $\hat{γ}$, the estimated variance components returned in $gamma$.

estid : int, ndarray, shape $(:, lb)$

An array describing the parameter estimates returned in $b$. The first $nlsv \times nrf$ columns of $estid$ describe the parameter estimates for the random effects and the last $nff$ columns the parameter estimates for the fixed effects.

For fixed effects:

for $l = nrf \times nlsv + 1, \dots, nrf \times nlsv + nff$

if $b [l - 1]$ contains the parameter estimate for the intercept, then

$estid[0, l - 1] = estid[1, l - 1] = estid[2, l - 1] = 0$;

if $b[l - 1]$ contains the parameter estimate for the $i$th level of the $j$th fixed variable, that is the vector of values held in the $k$th column of $dat$ when $fixed[j + 1] = k$, then

$estid[0, l - 1] = 0$, $estid[1, l - 1] = j$, $estid[2, l - 1] = i$;

if the $j$th variable is continuous or binary, that is $levels[fixed[j + 1] - 1] = 1$, $estid[2, l - 1] = 0$;

any remaining rows of the $l$th column of $estid$ are set to $0$.

For random effects:

let

$N_{R_{b}}$ denote the number of random variables in the $b$ th random statement, that is $N_{R_{b}} = rndm (1, b)$ ;

$R_{j b}$ denote the $j$ th random variable from the $b$ th random statement, that is the vector of values held in the $k$ th column of $dat$ when $rndm (2 + j, b) = k$ ;

$N_{S_{b}}$ denote the number of subject variables in the $b$ th random statement, that is $N_{S_{b}} = rndm (3 + N_{R_{b}}, b)$ ;

$S_{j b}$ denote the $j$ th subject variable from the $b$ th random statement, that is the vector of values held in the $k$ th column of $dat$ when $rndm (3 + N_{R_{b}} + j, b) = k$ ;

$L (S_{j b})$ denote the number of levels for $S_{j b}$ , that is $L (S_{j b}) = levels [rndm (3 + N_{R_{b}} + j, b) - 1]$ ;

then

for $l = 1, 2, \dots, nrf \times nlsv$, if $b[l - 1]$ contains the parameter estimate for the $i$th level of $R_{jb}$ when $S_{kb} = s_{k}$, for $k = 1, 2, \dots, N_{S_{b}}$ and $1 \leq s_{k} \leq L(S_{kb})$, i.e., $s_{k}$ is a valid value for the $k$th subject variable, then

$estid[0, l - 1] = b$, $estid[1, l - 1] = j$, $estid[2, l - 1] = i$, $estid[3 + k - 1, l - 1] = s_{k}$, for $k = 1, 2, \dots, N_{S_{b}}$;

if the parameter being estimated is for the intercept, then $estid[1, l - 1] = estid[2, l - 1] = 0$;

if the $j$th variable is continuous, or binary, that is $L(S_{jb}) = 1$, then $estid[2, l - 1] = 0$;

the remaining rows of the $l$th column of $estid$ are set to $0$.

In some situations, certain combinations of variables are never observed.

In such circumstances all elements of the $l$th column of $estid$ are set to $-999$.
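
The encoding of $estid$ described above can be decoded column by column. The following is a minimal sketch, assuming $estid$ has at least three rows and that unused subject rows hold $0$ as stated. The helper name `describe_column` and the label wording are hypothetical.

```python
import numpy as np

def describe_column(estid, l, nrf, nlsv):
    """Label the parameter described by 0-based column l of estid."""
    col = np.asarray(estid)[:, l]
    if np.all(col == -999):
        return "combination never observed"
    first, var, level = (int(v) for v in col[:3])
    level_txt = f", level {level}" if level else ""
    if l >= nrf * nlsv:
        # Fixed-effect columns follow the nrf * nlsv random-effect columns.
        if var == 0:
            return "fixed intercept"
        return f"fixed variable {var}{level_txt}"
    # Rows 3 onward hold the subject-variable values s_k; unused rows are 0.
    subjects = [int(s) for s in col[3:] if s != 0]
    name = "intercept" if var == 0 else f"variable {var}{level_txt}"
    return f"random statement {first}: {name}, subject levels {subjects}"
```

Applied to each column in turn, this gives a readable legend for the entries of $b$ and $se$, which share the same ordering.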

b : float, ndarray, shape $(nff + nrf \times nlsv)$

The parameter estimates, with the first $nrf \times nlsv$ elements of $b$ containing the parameter estimates for the random effects, $ν$, and the remaining $nff$ elements containing the parameter estimates for the fixed effects, $β$. The order of these estimates is described by the $estid$ argument.

se : float, ndarray, shape $(nff + nrf \times nlsv)$

The standard errors of the parameter estimates given in $b$ .

czz : float, ndarray, shape $(nrf, :)$

If $nlsv = 1$, then $czz$ holds the lower triangular portion of the matrix $(1 / σ^{2})(Z^{T} \hat{R}^{-1} Z + \hat{G}^{-1})$, where $\hat{R}$ and $\hat{G}$ are the estimates of $R$ and $G$ respectively. If $nlsv > 1$, then $czz$ holds this matrix in compressed form, with the first $nrf$ columns holding the part of the matrix corresponding to the first level of the overall subject variable, the next $nrf$ columns the part corresponding to the second level of the overall subject variable, etc.

cxx : float, ndarray, shape $(nff, :)$

$cxx$ holds the lower triangular portion of the matrix $(1 / σ^{2}) X^{T} \hat{V}^{-1} X$, where $\hat{V}$ is the estimated value of $V$.

cxz : float, ndarray, shape $(nff, :)$

If $nlsv = 1$, then $cxz$ holds the matrix $(1 / σ^{2})(X^{T} \hat{V}^{-1} Z)\hat{G}$, where $\hat{V}$ and $\hat{G}$ are the estimates of $V$ and $G$ respectively. If $nlsv > 1$, then $cxz$ holds this matrix in compressed form, with the first $nrf$ columns holding the part of the matrix corresponding to the first level of the overall subject variable, the next $nrf$ columns the part corresponding to the second level of the overall subject variable, etc.
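
When $nlsv > 1$, the compressed form described for $czz$ and $cxz$ can be unpacked into one block per level of the overall subject variable. The sketch below assumes the array has at least $nrf \times nlsv$ columns. The helper name `split_by_subject_level` is hypothetical.

```python
import numpy as np

def split_by_subject_level(mat, nrf, nlsv):
    """Return nlsv blocks, each made of nrf consecutive columns of mat."""
    mat = np.asarray(mat)
    return [mat[:, k * nrf:(k + 1) * nrf] for k in range(nlsv)]
```

The same slicing applies to both arrays: for $czz$ each block has $nrf$ rows, and for $cxz$ each block has $nff$ rows.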

Raises

NagValueError

(errno $1$ )

On entry, $lvpr = ⟨value⟩$.

Constraint: $lvpr \geq ⟨value⟩$.

(errno $2$ )

On entry, $vpr[⟨value⟩] = ⟨value⟩$ and $nvpr = ⟨value⟩$.

Constraint: $1 \leq vpr[i] \leq nvpr$.

(errno $3$ )

On entry, $nvpr = ⟨value⟩$.

Constraint: $1 \leq nvpr \leq ⟨value⟩$.

(errno $4$ )

On entry, $gamma[⟨value⟩] = ⟨value⟩$.

Constraint: $gamma[0] = -1.0$ or $gamma[i - 1] \geq 0.0$.

(errno $9$ )

On entry, $lb = ⟨value⟩$.

Constraint: $lb \geq ⟨value⟩$.

(errno $11$ )

On entry, $ldid = ⟨value⟩$.

Constraint: $ldid \geq ⟨value⟩$.

(errno $21$ )

On entry, $comm$[‘iopts’] has not been initialized correctly.

(errno $32$ )

On entry, at least one value of $i$, for $i = 1, 2, \dots, nvpr$, does not appear in $vpr$.

Warns

NagAlgorithmicWarning

(errno $101$ ): Optimal solution found, but requested accuracy not achieved.
(errno $102$ ): Too many major iterations.
(errno $103$ ): Current point cannot be improved upon.
(errno $104$ ): At least one negative estimate for $gamma$ was obtained. All negative estimates have been set to zero.

Notes

mixeff_hier_reml fits a model of the form:

$$y = X β + Z ν + ϵ$$

where	$y$ is a vector of $n$ observations on the dependent variable,
	$X$ is a known $n \times p$ design matrix for the fixed independent variables,
	$β$ is a vector of length $p$ of unknown fixed effects,
	$Z$ is a known $n \times q$ design matrix for the random independent variables,
	$ν$ is a vector of length $q$ of unknown random effects,
and	$ϵ$ is a vector of length $n$ of unknown random errors.

Both $ν$ and $ϵ$ are assumed to have a Gaussian distribution with expectation zero and variance/covariance matrix defined by

$$\mathrm{Var}\begin{bmatrix} ν \\ ϵ \end{bmatrix} = \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix}$$

where $R = σ_{R}^{2} I$, $I$ is the $n \times n$ identity matrix and $G$ is a diagonal matrix. It is assumed that the random variables, $Z$, can be subdivided into $g \leq q$ groups with each group being identically distributed with expectation zero and variance $σ_{i}^{2}$. The diagonal elements of matrix $G$, therefore, take one of the values $\{σ_{i}^{2} : i = 1, 2, \dots, g\}$, depending on which group the associated random variable belongs to.

The model, therefore, contains three sets of unknowns: the fixed effects $β$, the random effects $ν$ and a vector of $g + 1$ variance components $γ$, where $γ = \{σ_{1}^{2}, σ_{2}^{2}, \dots, σ_{g - 1}^{2}, σ_{g}^{2}, σ_{R}^{2}\}$. Rather than working directly with $γ$, mixeff_hier_reml uses an iterative process to estimate $γ^{*} = \{σ_{1}^{2} / σ_{R}^{2}, σ_{2}^{2} / σ_{R}^{2}, \dots, σ_{g - 1}^{2} / σ_{R}^{2}, σ_{g}^{2} / σ_{R}^{2}, 1\}$. Due to the iterative nature of the estimation, a set of initial values, $γ_{0}$, for $γ^{*}$ is required. mixeff_hier_reml allows these initial values either to be supplied by you or calculated from the data using the minimum variance quadratic unbiased estimators (MIVQUE0) suggested by Rao (1972).
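
The relationship between the variance components, their ratios and the diagonal matrix $G$ can be illustrated with a short NumPy sketch. The function names `ratio_vector` and `build_g`, and the 1-based `group` labelling of the columns of $Z$, are assumptions made for illustration only.

```python
import numpy as np

def ratio_vector(sigma2, sigma2_r):
    """gamma* = (sigma_1^2/sigma_R^2, ..., sigma_g^2/sigma_R^2, 1)."""
    return np.append(np.asarray(sigma2, dtype=float) / sigma2_r, 1.0)

def build_g(group, sigma2):
    """Diagonal G whose k-th entry is sigma2[group[k] - 1] (1-based groups)."""
    sigma2 = np.asarray(sigma2, dtype=float)
    return np.diag(sigma2[np.asarray(group) - 1])
```

For instance, with three random-effect columns in groups `[1, 1, 2]` and variances `[2.0, 4.0]`, `build_g` returns a diagonal matrix with entries 2, 2 and 4.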

mixeff_hier_reml fits the model by maximizing the restricted log-likelihood function:

$$-2 l_{R} = \log(|V|) + (n - p) \log(r^{T} V^{-1} r) + \log(|X^{T} V^{-1} X|) + (n - p)(1 + \log(2π / (n - p)))$$

where

$$V = Z G Z^{T} + R, \quad r = y - X b \quad \text{and} \quad b = (X^{T} V^{-1} X)^{-1} X^{T} V^{-1} y.$$

Once the final estimates for $γ^{*}$ have been obtained, the value of $σ_{R}^{2}$ is given by

$$σ_{R}^{2} = (r^{T} V^{-1} r) / (n - p).$$
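
The expressions above can be evaluated directly for a small dense problem. The sketch below is not the sweep-based computation used by the library. It assumes that $V$ is formed from the ratios $γ^{*}$ (so that $R$ is replaced by the identity matrix), that $X$ has full column rank, and that dense linear algebra is affordable. The function name `neg2_restricted_loglik` is hypothetical.

```python
import numpy as np

def neg2_restricted_loglik(y, X, Z, gamma_star_diag):
    """Return (-2 l_R, b, sigma_R^2) with V = Z G* Z^T + I."""
    n, p = X.shape
    V = Z @ np.diag(gamma_star_diag) @ Z.T + np.eye(n)
    XtVX = X.T @ np.linalg.solve(V, X)            # X^T V^-1 X
    b = np.linalg.solve(XtVX, X.T @ np.linalg.solve(V, y))
    r = y - X @ b                                 # residuals
    rVr = r @ np.linalg.solve(V, r)               # r^T V^-1 r
    _, logdet_v = np.linalg.slogdet(V)
    _, logdet_x = np.linalg.slogdet(XtVX)
    dof = n - p
    value = (logdet_v + dof * np.log(rVr) + logdet_x
             + dof * (1.0 + np.log(2.0 * np.pi / dof)))
    return value, b, rVr / dof
```

Here `gamma_star_diag` holds one ratio per column of $Z$, that is the diagonal of $G / σ_{R}^{2}$, and the third returned value is the estimate of $σ_{R}^{2}$ given by the last formula.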

Case weights, $W_{c}$ , can be incorporated into the model by replacing $X^{T} X$ and $Z^{T} Z$ with $X^{T} W_{c} X$ and $Z^{T} W_{c} Z$ respectively, for a diagonal weight matrix $W_{c}$ .

The log-likelihood, $l_{R}$ , is calculated using the sweep algorithm detailed in Wolfinger et al. (1994).

References

Goodnight, J H, 1979, A tutorial on the SWEEP operator, The American Statistician (33(3)), 149–158

Harville, D A, 1977, Maximum likelihood approaches to variance component estimation and to related problems, J. Am. Stat. Assoc. (72), 320–340

Rao, C R, 1972, Estimation of variance and covariance components in a linear model, J. Am. Stat. Assoc. (67), 112–115

Stroup, W W, 1989, Predictable functions and prediction space in the mixed model procedure, Applications of Mixed Models in Agriculture and Related Disciplines (Southern Cooperative Series Bulletin No. 343), 39–48

Wolfinger, R, Tobias, R and Sall, J, 1994, Computing Gaussian likelihoods and their derivatives for general linear mixed models, SIAM Sci. Statist. Comput. (15), 1294–1310
