NAG FL Interface
g22ydf (lm_submodel)

Note: please be advised that this routine is classed as ‘experimental’ and its interface may be developed further in the future. Please see Section 4 in How to Use the NAG Library for further information.

1 Purpose

g22ydf produces labels for the columns of a design matrix and for the model parameters, together with a vector of column inclusion flags suitable for use with routines in Chapter G02. This allows submodels to be fit using the same design matrix.

2 Specification

Fortran Interface
Subroutine g22ydf ( what, hform, hxdesc, intcpt, ip, lisx, isx, lplab, plab, lvinfo, vinfo, ifail)
Integer, Intent (In) :: lisx, lplab, lvinfo
Integer, Intent (Inout) :: ifail
Integer, Intent (Out) :: ip, isx(lisx), vinfo(lvinfo)
Character (*), Intent (In) :: what
Character (*), Intent (Out) :: intcpt, plab(lplab)
Type (c_ptr), Intent (In) :: hform, hxdesc
C Header Interface
#include <nag.h>
void  g22ydf_ (const char *what, void **hform, void **hxdesc, char *intcpt, Integer *ip, const Integer *lisx, Integer isx[], const Integer *lplab, char plab[], const Integer *lvinfo, Integer vinfo[], Integer *ifail, const Charlen length_what, const Charlen length_intcpt, const Charlen length_plab)
The routine may be called by the names g22ydf or nagf_blgm_lm_submodel.

3 Description

g22ydf is a utility routine for use with g22yaf, g22ybf and g22ycf. It can be used to construct labels for the columns of an n×mx design matrix, X, created by g22ycf, and to return additional input vectors and flags required by a number of NAG Library model fitting routines.
Many of the analysis routines that require a design matrix to be supplied allow submodels to be defined through the use of a vector of ones or zeros indicating whether a column of X should be included in or excluded from the analysis (see for example isx in g02daf or g02gaf). This allows nested models to be fit without having to reconstruct the design matrix for each analysis.
Let M denote a model constructed by g22yaf, D a data matrix as described by g22ybf and X the corresponding design matrix constructed by g22ycf from M and D. A different model, MS, is a submodel of M if each term in MS, including the mean effect (intercept term), is also present in M.
If MS is a submodel of M, you can fit MS to D using a design matrix whose columns are a subset of the columns of X.
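The meaning of the inclusion flags can be sketched in a few lines of plain Python. This is an illustration only: it makes no NAG calls, and the helper name select_columns, the matrix values and the flag vector are all made up for the example.

```python
# Illustrative only: x_full and isx are made-up values, not NAG output.

def select_columns(x, isx):
    """Return the columns of x (a list of rows) whose flag in isx is 1."""
    if any(len(row) != len(isx) for row in x):
        raise ValueError("isx must have one flag per column of x")
    keep = [j for j, flag in enumerate(isx) if flag == 1]
    return [[row[j] for j in keep] for row in x]

# A 3 x 4 design matrix for the full model M (hypothetical values).
x_full = [
    [1.0, 0.0, 2.5, 0.0],
    [0.0, 1.0, 1.5, 1.5],
    [1.0, 0.0, 3.0, 0.0],
]

# Flags for a submodel MS that drops the fourth column of X.
isx = [1, 1, 1, 0]

x_sub = select_columns(x_full, isx)
```

In practice routines such as g02daf accept the flag vector directly, so the columns need not be extracted by hand; the sketch only shows which part of X a given set of flags refers to.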

4 References

None.

5 Arguments

1: what Character(*) Input
On entry: controls what labels are to be produced:
what='S'
Labels for a submodel are required. The submodel must be supplied in hform.
what='X'
Labels for the design matrix X.
If hxdesc was returned by g02jff in hlmm then X is the design matrix associated with the fixed parameters.
what='Z'
Labels for the design matrix Z.
If hxdesc was returned by g02jff in hlmm then Z is the part of the design matrix associated with the random parameters.
what='V'
Labels for the variance components.
Constraints:
  • if hxdesc was returned by g02jff in hlmm, what='X', 'Z' or 'V';
  • otherwise what='S' or 'X'.
2: hform Type (c_ptr) Input
On entry: a G22 handle to the internal data structure containing a description of the required submodel MS, as returned in hform by g22yaf. If what≠'S', hform is not referenced and need not be set.
3: hxdesc Type (c_ptr) Input
On entry: a G22 handle to the internal data structure containing a description of the design matrix, X.
Constraint: a G22 handle as returned by g22ycf in hxdesc or by g02jff or g02jgf in hlmm.
4: intcpt Character(*) Output
On exit: indicates whether an analysis routine should include an implicit mean effect (intercept term) when fitting the model MS to D using X:
intcpt='M'
An implicit mean effect should be included by the analysis routine.
intcpt='Z'
MS does not include a mean effect, or the mean effect has been explicitly included in the design matrix.
5: ip Integer Output
On exit: p, the number of parameters in the (sub)model, including the intercept if one is present. If what='S', the submodel is the one specified in hform; otherwise the model is the one used when defining the design matrix described in hxdesc.
If lisx≠0 and intcpt='Z', $p=\sum_{i=1}^{m_x}\mathrm{isx}(i)$; otherwise, if lisx≠0, $p=\sum_{i=1}^{m_x}\mathrm{isx}(i)+1$.
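The relationship between p, isx and intcpt can be checked with a short sketch. It is plain Python with no NAG calls; the function name parameter_count and the flag values are hypothetical, and the rule coded is the one read from the text above (one extra parameter when intcpt='M').

```python
def parameter_count(isx, intcpt):
    """Number of model parameters implied by isx and intcpt.

    isx    -- sequence of 0/1 column inclusion flags
    intcpt -- 'M' if an implicit mean effect is to be added, 'Z' otherwise
    """
    if intcpt not in ("M", "Z"):
        raise ValueError("intcpt must be 'M' or 'Z'")
    return sum(isx) + (1 if intcpt == "M" else 0)

# Made-up flags: three of four columns are included.
p_with_mean = parameter_count([1, 0, 1, 1], "M")
p_without_mean = parameter_count([1, 0, 1, 1], "Z")
```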
6: lisx Integer Input
On entry: length of isx.
Constraint: lisx=0 or lisx≥mx, where mx is the number of columns in the design matrix X.
7: isx(lisx) Integer array Output
On exit: if lisx≠0, an array indicating which columns of the design matrix form the model specified in hform.
isx(j)=0
The jth column of the design matrix, X, should not be included in the analysis.
isx(j)=1
The jth column of the design matrix, X, should be included in the analysis.
If lisx=0, isx is not referenced.
8: lplab Integer Input
On entry: the length of plab.
As p≤mx+1, if labels are required, using lplab=mx+1 will always be sufficient.
Constraint: lplab=0 or lplab≥p.
9: plab(lplab) Character(*) array Output
On exit: if lplab≠0, the names associated with the p parameters in the model.
If intcpt='Z', the labels in plab are also the labels for the columns of the design matrix used in the analysis.
If intcpt='M', plab(2) to plab(p) are the corresponding column labels.
If a mean effect is present in MS, the corresponding label is always in plab(1).
If lplab=0, plab is not referenced.
10: lvinfo Integer Input
On entry: the length of vinfo.
Let $n_T$ denote the number of terms in MS, $n_{T_t}$ denote the number of variables in the tth term and $m_{x_t}$ denote the number of columns of X corresponding to the tth term. The required size of vinfo, denoted a, is given by:
$$a=\sum_{t=1}^{n_T} m_{x_t}\left(1+3n_{T_t}\right).$$
If the model includes a mean effect, a should be incremented by one.
The values $n_T$, $n_{T_t}$ and $m_{x_t}$ are not trivial to calculate as they require the formula describing the model to be fully expanded and the contrast / dummy variable encoding to be known. Therefore, if lisx, lplab or lvinfo are too small and lvinfo≥3, ifail=102 is returned and the required sizes for these arrays are returned in vinfo(1), vinfo(2) and vinfo(3) respectively.
Constraint: lvinfo=0 or lvinfo≥a.
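As a worked illustration of the size calculation, the sketch below reads the formula as: each column of X belonging to term t needs one count plus three integers per variable in that term, and a mean effect needs one further element. It is plain Python with no NAG calls; the function name vinfo_size and the example model are hypothetical.

```python
def vinfo_size(terms, has_mean):
    """Required length a of vinfo under the reading described above.

    terms    -- list of (m_xt, n_Tt) pairs, one per term: the number of
                columns of X for the term and the number of variables in it
    has_mean -- True if the model includes a mean effect
    """
    a = sum(m_xt * (1 + 3 * n_tt) for m_xt, n_tt in terms)
    return a + 1 if has_mean else a

# Hypothetical model: a main effect spanning 2 columns (1 variable), a
# two-way interaction spanning 4 columns (2 variables), and a mean effect.
a = vinfo_size([(2, 1), (4, 2)], has_mean=True)
```

Since the per-term values are rarely known in advance, the ifail=102 mechanism described above remains the practical way to obtain the required sizes.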
11: vinfo(lvinfo) Integer array Output
On exit: if lvinfo≠0, information encoding a description of the parameters in the model.
The encoding information can be extracted as follows:
  (i) Set k=1.
  (ii) Iterate j from 1 to p.
    1. Set b=vinfo(k).
    2. Increment k.
    3. Iterate i from 1 to b.
      (a) Set vi=vinfo(k).
      (b) Set li=vinfo(k+1).
      (c) Set ci=vinfo(k+2).
      (d) Increment k by 3.
    4. The jth model parameter corresponds to the interaction between the b variables held in columns v1, v2, …, vb of D. Therefore, b=1 indicates a main effect, b=2 a two-way interaction, etc.
      If b=0, the jth model parameter corresponds to the mean effect.
      If li=0, the corresponding variable vi is binary, ordinal or continuous. Otherwise, li is the level of the corresponding variable for model parameter j.
      ci is a numeric flag indicating the contrast used in the case of a categorical variable, with ci=0 indicating that dummy variables were used for variable vi in this term. The remaining six types of contrast (treatment contrasts with respect to the first and last levels, sum contrasts with respect to the first and last levels, Helmert contrasts and polynomial contrasts), as described in g22ycf, are identified by the integers one to six respectively.
If lvinfo=0, vinfo is not referenced.
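The extraction steps above can be sketched as a small decoder. This is plain Python with no NAG calls; the function name decode_vinfo and the sample encoding are made up for illustration, and the index k is 0-based here whereas the steps above use 1-based indexing.

```python
def decode_vinfo(vinfo, p):
    """Decode vinfo into one entry per model parameter.

    Each entry is a list of (v, l, c) triples: the column of D holding the
    variable, its level (0 if binary, ordinal or continuous) and the
    contrast flag. An empty list denotes the mean effect (b=0).
    """
    params = []
    k = 0  # 0-based position in vinfo
    for _ in range(p):
        b = vinfo[k]  # number of variables in this parameter's term
        k += 1
        triples = []
        for _ in range(b):
            triples.append((vinfo[k], vinfo[k + 1], vinfo[k + 2]))
            k += 3
        params.append(triples)
    return params

# Made-up encoding for p=3 parameters: a mean effect, a main effect of the
# variable in column 2, and an interaction between level 2 of the variable
# in column 1 (contrast flag 1) and the variable in column 2.
example = [0,
           1, 2, 0, 0,
           2, 1, 2, 1, 2, 0, 0]
decoded = decode_vinfo(example, p=3)
```

Returning the triples rather than formatted strings leaves the choice of label text to the caller, for example when building more verbose labels than those in plab.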
12: ifail Integer Input/Output
On entry: ifail must be set to 0, -1 or 1. If you are unfamiliar with this argument you should refer to Section 4 in the Introduction to the NAG Library FL Interface for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value -1 or 1 is recommended. If the output of error messages is undesirable, then the value 1 is recommended. Otherwise, if you are not familiar with this argument, the recommended value is 0. When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit: ifail=0 unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry ifail=0 or -1, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
ifail=11
On entry, what=value was an illegal value.
ifail=12
Supplied value of what is not valid for the G22 handle supplied in hxdesc.
ifail=21
hform has not been initialized or is corrupt.
ifail=22
hform is not a G22 handle as generated by g22yaf.
ifail=23
A variable name used when creating hform is not present in hxdesc.
Variable name: value.
ifail=24
The model and the design matrix are not consistent. The design matrix was constructed in the presence of a mean effect and the model does not include a mean effect.
ifail=25
The model and the design matrix are not consistent. The model includes a term not present in the design matrix.
Term: value.
ifail=26
The model and the design matrix are not consistent.
Term: value.
This is likely due to the design matrix being constructed in the presence of either a mean effect or main effect that is not present in the model.
ifail=27
The model and the design matrix are not consistent. The model specifies different contrasts to those used when the design matrix was constructed. The contrasts specified in hform will be ignored.
ifail=31
hxdesc has not been initialized or is corrupt.
ifail=32
hxdesc is not a G22 handle as generated by g22ycf.
ifail=33
hxdesc has not passed through the model fitting routine. The information returned by this routine may not be consistent with results returned from the model fitting routine if the data has been updated after the creation of hxdesc.
ifail=61
On entry, lisx=value and mx=value.
Constraint: lisx=0 or lisx≥mx.
ifail=81
On entry, lplab=value and p=value.
Constraint: lplab=0 or lplab≥p.
ifail=91
On entry, plab is too short to hold the parameter labels. Long labels will be truncated.
The longest parameter label is value.
ifail=101
On entry, lvinfo is too small.
lvinfo=value.
Constraint: lvinfo=0 or lvinfo≥value.
ifail=102
On entry, one or more of lisx, lplab or lvinfo are nonzero, but too small.
Minimum values are zero, or value, value and value respectively.
The minimum values are returned in the first three elements of vinfo.
ifail=-99
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
ifail=-399
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
ifail=-999
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.

7 Accuracy

Not applicable.

8 Parallelism and Performance

g22ydf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

None.

10 Example

This example performs a linear regression using g02daf. The linear regression model is defined via a text string which is parsed using g22yaf and the design matrix associated with the model is generated using g22ycf. A submodel is then fit using the same design matrix.
Default parameter labels, as returned in plab, are used for both models. An example of using the information returned in vinfo to construct more verbose parameter labels is given in Section 10 in g22ybf.
See also the examples for g22yaf and g22ycf.

10.1 Program Text

Program Text (g22ydfe.f90)

10.2 Program Data

Program Data (g22ydfe.d)

10.3 Program Results

Program Results (g22ydfe.r)