G02GPF allows prediction from a generalized linear model fitted via G02GAF, G02GBF, G02GCF or G02GDF.
SUBROUTINE G02GPF (ERRFN, LINK, MEAN, OFFSET, WEIGHT, N, X, LDX, M, ISX, IP, T, OFF, WT, S, A, B, COV, VFOBS, ETA, SEETA, PRED, SEPRED, IFAIL)
INTEGER             N, LDX, M, ISX(M), IP, IFAIL
REAL (KIND=nag_wp)  X(LDX,*), T(*), OFF(*), WT(*), S, A, B(IP), COV(IP*(IP+1)/2), ETA(N), SEETA(N), PRED(N), SEPRED(N)
LOGICAL             VFOBS
CHARACTER(1)        ERRFN, LINK, MEAN, OFFSET, WEIGHT
A generalized linear model consists of the following elements:
(i) A suitable distribution for the dependent variable $y$.
(ii) A linear model, with linear predictor $\eta = X\beta$, where $X$ is a matrix of independent variables and $\beta$ a column vector of parameters.
(iii) A link function $g(\cdot)$ between the expected value of $y$ and the linear predictor, that is $E(y) = \mu = g^{-1}(\eta)$.
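In order to predict from a generalized linear model, that is, to estimate a value for the dependent variable, $y$, given a set of independent variables $X$, the matrix $X$ must be supplied, along with values for the parameters $\beta$ and their associated variance-covariance matrix, $C$. Suitable values for $\beta$ and $C$ are usually estimated by first fitting the prediction model to a training dataset with known responses, using for example G02GAF, G02GBF, G02GCF or G02GDF. The predicted variable and its standard error can then be obtained from:
$$\hat{y} = g^{-1}(\eta), \qquad \mathrm{se}(\hat{y}) = \sqrt{\left(\frac{\partial g^{-1}(x)}{\partial x}\right)_{\eta}^{2} \mathrm{se}(\eta)^{2} + I_{\mathrm{fobs}}\,\mathrm{Var}(y)}$$
where
$$\eta = o + X\beta, \qquad \mathrm{se}(\eta) = \sqrt{\operatorname{diag}\left(X C X^{T}\right)},$$
$o$ is a vector of offsets and $I_{\mathrm{fobs}} = 0$ if the variance of future observations is not taken into account, and $I_{\mathrm{fobs}} = 1$ otherwise. Here $\operatorname{diag} A$ indicates the diagonal elements of matrix $A$, and the square root is taken element by element.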
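As an illustration of the prediction step, the following is a minimal sketch in Python (it does not call G02GPF). It assumes a log link, no future-observation variance, and the usual delta-method form in which the standard error of the prediction is the absolute derivative of the inverse link times the standard error of the linear predictor. The function name `predict_log_link`, the full (unpacked) covariance matrix and the explicit intercept column in each row are conventions of this sketch only, not of the routine.

```python
import math

def predict_log_link(x_rows, beta, cov, offsets=None):
    """Return (eta, se_eta, pred, se_pred) for a log link, ignoring the
    variance of future observations.

    x_rows  : list of design rows (hypothetical layout: intercept column included)
    beta    : list of parameter estimates
    cov     : full symmetric variance-covariance matrix of beta
    offsets : optional list of offsets, one per row
    """
    p = len(beta)
    if offsets is None:
        offsets = [0.0] * len(x_rows)
    eta, se_eta, pred, se_pred = [], [], [], []
    for row, o in zip(x_rows, offsets):
        # Linear predictor: eta_i = o_i + x_i' beta
        e = o + sum(xj * bj for xj, bj in zip(row, beta))
        # Variance of eta_i: x_i' C x_i (a diagonal element of X C X')
        var_e = sum(row[a] * cov[a][b] * row[b]
                    for a in range(p) for b in range(p))
        s = math.sqrt(var_e)
        mu = math.exp(e)              # inverse of the log link
        eta.append(e)
        se_eta.append(s)
        pred.append(mu)
        se_pred.append(mu * s)        # d exp(eta)/d eta = exp(eta)
    return eta, se_eta, pred, se_pred

# Two new observations with illustrative (made-up) parameter values.
x_new = [[1.0, 0.0], [1.0, 2.0]]
beta = [0.5, 0.25]
cov = [[0.04, 0.0], [0.0, 0.01]]
eta, se_eta, pred, se_pred = predict_log_link(x_new, beta, cov)
```

For other links only the inverse link and its derivative change; the computation of the linear predictor and its standard error is the same.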
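If required, the variance for the $i$th future observation, $\mathrm{Var}(y_i)$, can be calculated as:
$$\mathrm{Var}(y_i) = \frac{\phi V(\theta)}{w_i}$$
where $w_i$ is a weight, $\phi$ is the scale (or dispersion) parameter, and $V(\theta)$ is the variance function. Both the scale parameter and the variance function depend on the distribution used for the $y_i$, with:
Poisson  | $V(\theta) = \mu_i$, $\phi = 1$
binomial | $V(\theta) = \mu_i (t_i - \mu_i) / t_i$, $\phi = 1$
Normal   | $V(\theta) = 1$
gamma    | $V(\theta) = \mu_i^2$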
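The variance of a future observation can be sketched as below. This is an illustrative Python function, not part of the library. It assumes the standard variance functions for the four families (Poisson $\mu$, binomial $\mu(t-\mu)/t$, Normal $1$, gamma $\mu^2$), a scale of 1 for the Poisson and binomial cases, and single-character family codes 'P', 'B', 'N' and 'G'. The name `future_obs_variance` and its keyword arguments are hypothetical.

```python
def future_obs_variance(errfn, mu, weight=1.0, scale=1.0, t=1.0):
    """Var(y_i) = phi * V(mu_i) / w_i for one observation.

    errfn  : family code, assumed to be 'P', 'B', 'N' or 'G'
    mu     : fitted mean for the observation
    weight : observation weight w_i (must be positive here to avoid division by zero)
    scale  : scale parameter phi, used only for the Normal and gamma families
    t      : binomial denominator, used only for the binomial family
    """
    if weight <= 0.0:
        raise ValueError("weight must be positive in this sketch")
    if errfn == "P":
        v, phi = mu, 1.0
    elif errfn == "B":
        v, phi = mu * (t - mu) / t, 1.0
    elif errfn == "N":
        v, phi = 1.0, scale
    elif errfn == "G":
        v, phi = mu * mu, scale
    else:
        raise ValueError("unknown family code: %r" % (errfn,))
    return phi * v / weight
```

For example, `future_obs_variance("B", 2.0, t=10.0)` evaluates $2 \times 8 / 10$, and `future_obs_variance("G", 2.0, weight=2.0, scale=0.5)` evaluates $0.5 \times 4 / 2$.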
- 1: ERRFN – CHARACTER(1) Input
On entry: indicates the distribution used to model the dependent variable, $y$.
- ERRFN = 'B': The binomial distribution is used.
- ERRFN = 'G': The gamma distribution is used.
- ERRFN = 'N': The Normal (Gaussian) distribution is used.
- ERRFN = 'P': The Poisson distribution is used.
Constraint: ERRFN = 'B', 'G', 'N' or 'P'.
- 2: LINK – CHARACTER(1) Input
On entry: indicates which link function is to be used.
- LINK = 'C': A complementary log-log link is used.
- LINK = 'E': An exponent link is used.
- LINK = 'G': A logistic link is used.
- LINK = 'I': An identity link is used.
- LINK = 'L': A log link is used.
- LINK = 'P': A probit link is used.
- LINK = 'R': A reciprocal link is used.
- LINK = 'S': A square root link is used.
Details on the functional form of the different links can be found in the G02 Chapter Introduction.
Constraints:
- if ERRFN = 'B', LINK = 'C', 'G' or 'P';
- otherwise LINK = 'E', 'I', 'L', 'R' or 'S'.
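The link constraint and the role of the inverse link can be sketched as follows. This Python fragment is illustrative only. It assumes the single-character link codes 'C', 'E', 'G', 'I', 'L', 'P', 'R' and 'S' (in the order the links are listed above) and the conventional textbook definitions of each link; the authoritative functional forms are those in the G02 Chapter Introduction. The names `link_allowed` and `inverse_link` are hypothetical.

```python
import math

# Link codes permitted for each family (assumed single-character codes).
BINOMIAL_LINKS = {"C", "G", "P"}
OTHER_LINKS = {"E", "I", "L", "R", "S"}

def link_allowed(errfn, link):
    """True if the link code is permitted for the given family code."""
    return link in (BINOMIAL_LINKS if errfn == "B" else OTHER_LINKS)

def inverse_link(link, eta, a=1.0, t=1.0):
    """Map a linear predictor eta to a mean, using conventional link definitions.

    a : power used by the exponent link (must be nonzero)
    t : binomial denominator, used by the three binomial links
    """
    if link == "I":                       # identity: eta = mu
        return eta
    if link == "L":                       # log: eta = log(mu)
        return math.exp(eta)
    if link == "R":                       # reciprocal: eta = 1/mu
        return 1.0 / eta
    if link == "S":                       # square root: eta = sqrt(mu)
        return eta * eta
    if link == "E":                       # exponent: eta = mu**a
        return eta ** (1.0 / a)
    if link == "G":                       # logistic: eta = log(mu/(t - mu))
        return t / (1.0 + math.exp(-eta))
    if link == "P":                       # probit: eta = Phi^{-1}(mu/t)
        return t * 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    if link == "C":                       # complementary log-log
        return t * (1.0 - math.exp(-math.exp(eta)))
    raise ValueError("unknown link code: %r" % (link,))
```

Checking `link_allowed` before evaluating `inverse_link` mirrors the way the constraint ties the choice of link to the choice of distribution.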
- 3: MEAN – CHARACTER(1) Input
On entry: indicates if a mean term is to be included.
- MEAN = 'M': A mean term (intercept) will be included in the model.
- MEAN = 'Z': The model will pass through the origin (zero-point).
Constraint: MEAN = 'M' or 'Z'.
- 4: OFFSET – CHARACTER(1) Input
On entry: indicates if an offset is required.
- OFFSET = 'Y': An offset must be supplied in OFF.
- OFFSET = 'N': OFF is not referenced.
Constraint: OFFSET = 'Y' or 'N'.
- 5: WEIGHT – CHARACTER(1) Input
On entry: if VFOBS = .TRUE., indicates if weights are used; otherwise WEIGHT is not referenced.
- WEIGHT = 'U': No weights are used.
- WEIGHT = 'W': Weights are used and must be supplied in WT.
Constraint: if VFOBS = .TRUE., WEIGHT = 'U' or 'W'.
- 6: N – INTEGER Input
On entry: $n$, the number of observations.
Constraint: N ≥ 1.
- 7: X(LDX,*) – REAL (KIND=nag_wp) array Input
Note: the second dimension of the array X must be at least M.
On entry: X(i,j) must contain the $i$th observation for the $j$th independent variable, for i = 1, 2, ..., N and j = 1, 2, ..., M.
- 8: LDX – INTEGER Input
On entry: the first dimension of the array X as declared in the (sub)program from which G02GPF is called.
Constraint: LDX ≥ N.
- 9: M – INTEGER Input
On entry: $m$, the total number of independent variables.
Constraint: M ≥ 1.
- 10: ISX(M) – INTEGER array Input
On entry: indicates which independent variables are to be included in the model.
If ISX(j) > 0, the $j$th independent variable is included in the regression model.
Constraints:
- ISX(j) ≥ 0, for j = 1, 2, ..., M;
- if MEAN = 'M', exactly IP − 1 values of ISX must be > 0;
- if MEAN = 'Z', exactly IP values of ISX must be > 0.
- 11: IP – INTEGER Input
On entry: the number of independent variables in the model, including the mean or intercept if present.
Constraint: IP > 0.
- 12: T(*) – REAL (KIND=nag_wp) array Input
Note: the dimension of the array T must be at least N if ERRFN = 'B', and at least 1 otherwise.
On entry: if ERRFN = 'B', T(i) must contain the binomial denominator, $t_i$, for the $i$th observation.
Otherwise T is not referenced.
Constraint: if ERRFN = 'B', T(i) ≥ 0.0, for i = 1, 2, ..., N.
- 13: OFF(*) – REAL (KIND=nag_wp) array Input
Note: the dimension of the array OFF must be at least N if OFFSET = 'Y', and at least 1 otherwise.
On entry: if OFFSET = 'Y', OFF(i) must contain the offset, $o_i$, for the $i$th observation.
Otherwise OFF is not referenced.
- 14: WT(*) – REAL (KIND=nag_wp) array Input
Note: the dimension of the array WT must be at least N if WEIGHT = 'W' and VFOBS = .TRUE., and at least 1 otherwise.
On entry: if WEIGHT = 'W' and VFOBS = .TRUE., WT(i) must contain the weight, $w_i$, for the $i$th observation.
If the variance of future observations is not included in the standard error of the predicted variable, WT is not referenced.
Constraint: if WEIGHT = 'W' and VFOBS = .TRUE., WT(i) ≥ 0.0, for i = 1, 2, ..., N.
- 15: S – REAL (KIND=nag_wp) Input
On entry: if ERRFN = 'N' or 'G' and VFOBS = .TRUE., the scale parameter, $\phi$.
Otherwise S is not referenced and $\phi = 1$.
Constraint: if ERRFN = 'N' or 'G' and VFOBS = .TRUE., S > 0.0.
- 16: A – REAL (KIND=nag_wp) Input
On entry: if LINK = 'E', A must contain the power of the exponential.
If LINK ≠ 'E', A is not referenced.
Constraint: if LINK = 'E', A ≠ 0.0.
- 17: B(IP) – REAL (KIND=nag_wp) array Input
On entry: the model parameters, $\beta$.
If MEAN = 'M', B(1) must contain the mean parameter and B(i+1) the coefficient of the independent variable held in the $j$th column of X, where ISX(j) is the $i$th positive value in the array ISX.
If MEAN = 'Z', B(i) must contain the coefficient of the independent variable held in the $j$th column of X, where ISX(j) is the $i$th positive value in the array ISX.
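The correspondence between elements of B and columns of X can be sketched as below. This is an illustrative Python helper, not a library call. It assumes the mean-term codes 'M' (intercept stored first in B) and 'Z' (no intercept), and it uses 1-based indices in its result to match the Fortran arrays. The names `coefficient_columns` and `required_ip` are hypothetical.

```python
def coefficient_columns(isx, mean):
    """Map each coefficient position in B to the column of X it multiplies.

    isx  : list of inclusion flags, one per column of X (positive = included)
    mean : 'M' if B(1) holds the intercept, 'Z' if there is no intercept
    Returns a dict {1-based index into B: 1-based column of X}.
    """
    included = [j for j, flag in enumerate(isx, start=1) if flag > 0]
    first = 2 if mean == "M" else 1   # B(1) is reserved for the intercept when present
    return {first + k: col for k, col in enumerate(included)}

def required_ip(isx, mean):
    """Number of parameters implied by ISX and the mean-term choice."""
    n_included = sum(1 for flag in isx if flag > 0)
    return n_included + (1 if mean == "M" else 0)
```

With `isx = [1, 0, 1]` and an intercept, B(2) multiplies column 1 of X and B(3) multiplies column 3, so three parameters are needed in total.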
- 18: COV(IP*(IP+1)/2) – REAL (KIND=nag_wp) array Input
On entry: the upper triangular part of the variance-covariance matrix, $C$, of the model parameters. This matrix should be supplied packed by column, i.e., the covariance between parameters $\beta_i$ and $\beta_j$, that is the values stored in B(i) and B(j), should be supplied in COV(j×(j−1)/2+i), for i = 1, 2, ..., IP and j = i, ..., IP.
Constraint: the matrix represented in COV must be a valid variance-covariance matrix.
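Packing a full covariance matrix into this layout can be sketched as follows. The Python below is illustrative and assumes the usual column-packed upper-triangle convention, in which element $(i, j)$ with $i \le j$ (1-based) is stored at position $j(j-1)/2 + i$. The names `packed_index` and `pack_upper` are hypothetical.

```python
def packed_index(i, j):
    """1-based position in the packed array of element (i, j), 1-based indices."""
    if i > j:
        i, j = j, i                   # the matrix is symmetric
    return j * (j - 1) // 2 + i

def pack_upper(c):
    """Pack the upper triangle of a square matrix column by column."""
    ip = len(c)
    return [c[i][j] for j in range(ip) for i in range(j + 1)]

# A 3 by 3 symmetric matrix whose upper triangle, read by column, is 1..6.
c = [[1.0, 2.0, 4.0],
     [2.0, 3.0, 5.0],
     [4.0, 5.0, 6.0]]
cov = pack_upper(c)
# The covariance between parameters 2 and 3 is cov[packed_index(2, 3) - 1];
# the "- 1" converts the 1-based position to a Python list index.
```

A packed array built this way has length IP×(IP+1)/2, matching the declared dimension of COV.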
- 19: VFOBS – LOGICAL Input
On entry: if VFOBS = .TRUE., the variance of future observations is included in the standard error of the predicted variable (i.e., $I_{\mathrm{fobs}} = 1$), otherwise $I_{\mathrm{fobs}} = 0$.
- 20: ETA(N) – REAL (KIND=nag_wp) array Output
On exit: the linear predictor, $\eta$.
- 21: SEETA(N) – REAL (KIND=nag_wp) array Output
On exit: the standard error of the linear predictor, $\mathrm{se}(\eta)$.
- 22: PRED(N) – REAL (KIND=nag_wp) array Output
On exit: the predicted value, $\hat{y}$.
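- 23: SEPRED(N) – REAL (KIND=nag_wp) array Output
On exit: the standard error of the predicted value, $\mathrm{se}(\hat{y})$. If PRED(i) could not be calculated, then G02GPF returns with a nonzero value of IFAIL (see Section 6), and SEPRED(i) is set to a flag value.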
- 24: IFAIL – INTEGER Input/Output
On entry: IFAIL must be set to 0, −1 or 1. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value −1 or 1 is recommended. If the output of error messages is undesirable, then the value 1 is recommended. Otherwise, because for this routine the values of the output parameters may be useful even if IFAIL ≠ 0 on exit, the recommended value is −1.
When the value −1 or 1 is used it is essential to test the value of IFAIL on exit.
On exit: IFAIL = 0 unless the routine detects an error or a warning has been flagged (see Section 6).
If on entry IFAIL = 0 or −1, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Accuracy: Not applicable.
G02GPF is not threaded by NAG in any implementation.
G02GPF makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
Further comments: None.
The model $y = \frac{1}{\beta_1 + \beta_2 x} + \varepsilon$ is fitted to a training dataset with five observations. The resulting model is then used to predict the response for two new observations.