NAG Library Routine Document
G13AEF
1 Purpose
G13AEF fits a seasonal autoregressive integrated moving average (ARIMA) model to an observed time series, using a nonlinear least squares procedure incorporating backforecasting. Parameter estimates are obtained, together with appropriate standard errors. The residual series is returned, and information for use in forecasting the time series is produced for use by the routines G13AGF and G13AHF.
The estimation procedure is iterative, starting with initial parameter values such as may be obtained using G13ADF. It continues until a specified convergence criterion is satisfied, or until a specified number of iterations has been carried out. The progress of the procedure can be monitored by means of a user-supplied routine.
2 Specification
SUBROUTINE G13AEF (MR, PAR, NPAR, C, KFC, X, NX, ICOUNT, EX, EXR, AL, IEX, S, G, IGH, SD, H, LDH, ST, IST, NST, PIV, KPIV, NIT, ITC, ZSP, KZSP, ISF, WA, IWA, HC, IFAIL)
INTEGER             MR(7), NPAR, KFC, NX, ICOUNT(6), IEX, IGH, LDH, IST, NST, KPIV, NIT, ITC, KZSP, ISF(4), IWA, IFAIL
REAL (KIND=nag_wp)  PAR(NPAR), C, X(NX), EX(IEX), EXR(IEX), AL(IEX), S, G(IGH), SD(IGH), H(LDH,IGH), ST(IST), ZSP(4), WA(IWA), HC(LDH,IGH)
EXTERNAL            PIV
3 Description
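The time series $x_1, x_2, \ldots, x_n$ supplied to G13AEF is assumed to follow a seasonal autoregressive integrated moving average (ARIMA) model defined as follows:
$$\nabla^d \nabla_s^D x_t - c = w_t,$$
where $\nabla^d \nabla_s^D x_t$ is the result of applying non-seasonal differencing of order $d$ and seasonal differencing of seasonality $s$ and order $D$ to the series $x_t$, as outlined in the description of G13AAF. The differenced series is then of length $N = n - d'$, where $d' = d + (D \times s)$ is the generalized order of differencing. The scalar $c$ is the expected value of the differenced series, and the series $w_1, w_2, \ldots, w_N$ follows a zero-mean stationary autoregressive moving average (ARMA) model defined by a pair of recurrence equations. These express $w_t$ in terms of an uncorrelated series $a_t$, via an intermediate series $e_t$. The first equation describes the seasonal structure:
$$w_t = \Phi_1 w_{t-s} + \Phi_2 w_{t-2 \times s} + \cdots + \Phi_P w_{t-P \times s} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2 \times s} - \cdots - \Theta_Q e_{t-Q \times s}.$$
The second equation describes the non-seasonal structure. If the model is purely non-seasonal the first equation is redundant and $e_t$ above is equated with $w_t$:
$$e_t = \phi_1 e_{t-1} + \phi_2 e_{t-2} + \cdots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}.$$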
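The differencing step can be illustrated with a minimal Python sketch. It assumes the usual definitions of non-seasonal differencing (each value minus its predecessor, applied d times) and seasonal differencing (each value minus the value s steps earlier, applied D times), followed by subtraction of the constant c. The function name `difference` is hypothetical and is not part of the NAG Library.

```python
def difference(x, d, D, s, c=0.0):
    """Apply d non-seasonal and D seasonal (period s) differences, then subtract c."""
    w = list(x)
    for _ in range(d):
        # non-seasonal difference: w_t - w_{t-1}
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    for _ in range(D):
        # seasonal difference: w_t - w_{t-s}
        w = [w[t] - w[t - s] for t in range(s, len(w))]
    return [v - c for v in w]


x = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
w = difference(x, d=1, D=1, s=2)
# each non-seasonal difference drops 1 value and each seasonal difference drops s values
print(len(x) - len(w), w)
```

In this example the series loses d + D × s = 3 values, which is the generalized order of differencing.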
Estimates of the model parameters defined by
$$\phi_1, \phi_2, \ldots, \phi_p, \; \theta_1, \theta_2, \ldots, \theta_q, \; \Phi_1, \Phi_2, \ldots, \Phi_P, \; \Theta_1, \Theta_2, \ldots, \Theta_Q$$
and (optionally) $c$ are obtained by minimizing a quadratic form in the vector $w = (w_1, w_2, \ldots, w_N)^{\mathrm{T}}$.
This is $QF = w^{\mathrm{T}} V^{-1} w$, where $V$ is the covariance matrix of $w$, and is a function of the model parameters. This matrix is not explicitly evaluated, since $QF$ may be expressed as a ‘sum of squares’ function. When moving average parameters $\theta_i$ or $\Theta_i$ are present, so that the generalized moving average order $q' = q + s \times Q$ is positive, backforecasts $w_{1-q'}, w_{2-q'}, \ldots, w_0$ are introduced as nuisance parameters. The ‘sum of squares’ function may then be written as
$$S(pm) = \sum_{t=1-q'}^{N} a_t^2 - \sum_{t=1-q'-p'}^{-q'} b_t^2,$$
where $pm$ is a combined vector of parameters, consisting of the backforecasts followed by the ARMA model parameters.
The terms $a_t$ correspond to the ARMA model residual series, and $p' = p + s \times P$ is the generalized autoregressive order. The terms $b_t$ are only present if autoregressive parameters are in the model, and serve to correct for transient errors introduced at the start of the autoregression.
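The equations defining $a_t$ and $b_t$ are precisely:
- $e_t = w_t - \Phi_1 w_{t-s} - \cdots - \Phi_P w_{t-s \times P} + \Theta_1 e_{t-s} + \cdots + \Theta_Q e_{t-s \times Q}$, for $t = 1-q', \ldots, N$.
- $a_t = e_t - \phi_1 e_{t-1} - \cdots - \phi_p e_{t-p} + \theta_1 a_{t-1} + \cdots + \theta_q a_{t-q}$, for $t = 1-q', \ldots, N$.
- $f_t = w_t - \Phi_1 w_{t+s} - \cdots - \Phi_P w_{t+s \times P} + \Theta_1 f_{t-s} + \cdots + \Theta_Q f_{t-s \times Q}$, for $t = (1-q'-s \times P), \ldots, (-q'+P)$.
- $b_t = f_t - \phi_1 f_{t+1} - \cdots - \phi_p f_{t+p} + \theta_1 b_{t-1} + \cdots + \theta_q b_{t-q}$, for $t = (1-q'-p'), \ldots, (-q')$.
For all four of these equations, the following conditions hold:
- $w_i = 0$ if $i < 1-q'$
- $e_i = 0$ if $i < 1-q'$
- $a_i = 0$ if $i < 1-q'$
- $f_i = 0$ if $i < 1-q'-s \times P$
- $b_i = 0$ if $i < 1-q'-p'$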
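The forward recurrences for the intermediate series and the residual series can be sketched in Python as follows. This is a simplified illustration only: it treats every value before the start of the supplied series as zero and omits both the backforecasts and the correction terms for autoregressive transients, so it is not the computation G13AEF itself performs. The function name `arma_residuals` and its argument layout are assumptions made for this example.

```python
def arma_residuals(w, phi, theta, Phi, Theta, s):
    """Compute intermediate series e and residual series a from a differenced series w.

    Seasonal stage:     e_t = w_t - sum Phi_k w_{t-k*s} + sum Theta_k e_{t-k*s}
    Non-seasonal stage: a_t = e_t - sum phi_k e_{t-k}   + sum theta_k a_{t-k}
    Values with an index before the start of w are taken as zero.
    Seasonal parameters require s >= 1.
    """
    def at(series, i):
        return series[i] if i >= 0 else 0.0

    e, a = [], []
    for t in range(len(w)):
        e_t = (w[t]
               - sum(Phi[k] * at(w, t - (k + 1) * s) for k in range(len(Phi)))
               + sum(Theta[k] * at(e, t - (k + 1) * s) for k in range(len(Theta))))
        e.append(e_t)
        a_t = (e_t
               - sum(phi[k] * at(e, t - (k + 1)) for k in range(len(phi)))
               + sum(theta[k] * at(a, t - (k + 1)) for k in range(len(theta))))
        a.append(a_t)
    return e, a


# One moving average parameter, no seasonal structure
e, a = arma_residuals([1.0, 0.5, 0.25], phi=[], theta=[0.5], Phi=[], Theta=[], s=0)
print(a, sum(v * v for v in a))
```

The sum of the squared residuals is the quantity that, after the backforecast and transient corrections are added, the routine minimizes.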
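Minimization of $S$ with respect to $pm$ uses an extension of the algorithm of Marquardt (1963).
The first derivatives of $S$ with respect to the parameters are calculated as
$$2 \times \sum a_t \times a_{t,i} - 2 \times \sum b_t \times b_{t,i} = 2 \times G_i,$$
where $a_{t,i}$ and $b_{t,i}$ are derivatives of $a_t$ and $b_t$ with respect to the $i$th parameter.
The second derivative of $S$ is approximated by
$$2 \times \sum a_{t,i} \times a_{t,j} - 2 \times \sum b_{t,i} \times b_{t,j} = 2 \times H_{ij}.$$
Successive parameter iterates are obtained by calculating a vector of corrections $dpm$ by solving the equations
$$(H + \alpha \times D) \times dpm = -G,$$
where $G$ is a vector with elements $G_i$, $H$ is a matrix with elements $H_{ij}$, $\alpha$ is a scalar used to control the search and $D$ is the diagonal matrix of $H$.
The new parameter values are then $pm + dpm$.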
The scalar $\alpha$ controls the step size, to which it is inversely related.
If a step results in new parameter values which give a reduced value of $S$, then $\alpha$ is reduced by a factor $\beta$. If a step results in new parameter values which give an increased value of $S$, or in ARMA model parameters which in any way contravene the stationarity and invertibility conditions, then the new parameters are rejected, $\alpha$ is increased by the factor $\beta$, and the revised equations are solved for a new parameter correction.
This action is repeated until either a reduced value of $S$ is obtained, or $\alpha$ reaches the limit of $10^9$, which is used to indicate a failure of the search procedure.
This failure may be due to a badly conditioned sum of squares function or to too strict a convergence criterion. Convergence is deemed to have occurred if the fractional reduction in the residual sum of squares in successive iterations is less than a value $\gamma$, while $\alpha < 1.0$.
The stationarity and invertibility conditions are tested to within a specified tolerance multiple of machine accuracy. Upon convergence, or completion of the specified maximum number of iterations without convergence, statistical properties of the estimates are derived. In the latter case the sequence of iterates should be checked to ensure that convergence is adequate for practical purposes, otherwise these properties are not reliable.
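The search strategy described above can be sketched in Python on a generic least squares problem. This is a minimal illustration under stated assumptions, not the routine's implementation: the names `solve` and `marquardt` are hypothetical, the caller supplies the residuals and their derivatives, the stationarity and invertibility checks are omitted, and the control values (a starting step control of 0.001, a multiplier of 10, a convergence criterion of 1e-7 and a step control limit of 1e9) are assumed for the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x


def marquardt(resid, jac, pm, alpha=1e-3, beta=10.0, gamma=1e-7, nit=50):
    """Minimize sum(resid(pm)**2); jac(pm)[t][i] is d resid_t / d pm_i."""
    S = sum(r * r for r in resid(pm))
    m = len(pm)
    for _ in range(nit):
        r, J = resid(pm), jac(pm)
        G = [sum(r[t] * J[t][i] for t in range(len(r))) for i in range(m)]
        H = [[sum(J[t][i] * J[t][j] for t in range(len(r))) for j in range(m)]
             for i in range(m)]
        while True:
            # damped system (H + alpha * D) dpm = -G, with D the diagonal of H
            A = [[H[i][j] + (alpha * H[i][i] if i == j else 0.0) for j in range(m)]
                 for i in range(m)]
            dpm = solve(A, [-g for g in G])
            trial = [p + d for p, d in zip(pm, dpm)]
            S_new = sum(v * v for v in resid(trial))
            if S_new < S:
                break                      # accept the step
            alpha *= beta                  # reject: shorten the step and retry
            if alpha >= 1e9:
                return pm, S, "search failed"
        frac = (S - S_new) / S
        pm, S = trial, S_new
        alpha /= beta                      # success: allow a longer step next time
        if frac < gamma and alpha < 1.0:
            return pm, S, "converged"
    return pm, S, "iteration limit"


# Straight-line fit y = pm[0] + pm[1] * t as a small test problem
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
resid = lambda pm: [yi - (pm[0] + pm[1] * ti) for ti, yi in zip(t, y)]
jac = lambda pm: [[-1.0, -ti] for ti in t]
print(marquardt(resid, jac, [0.0, 0.0]))
```

Returning a status string rather than raising an error mirrors the way the latest parameter values remain available after a search failure.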
The estimated residual variance is
$$erv = S_{\min} / df,$$
where $S_{\min}$ is the final value of $S$, and the residual number of degrees of freedom is given by
$$df = N - p - q - P - Q \quad (-1 \text{ if } c \text{ is estimated}).$$
The covariance matrix of the vector of estimates $pm$ is given by
$$erv \times H^{-1},$$
where $H$ is evaluated at the final parameter values.
From this expression are derived the vector of standard deviations, and the correlation matrix for the whole parameter set. These are asymptotic approximations.
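The derivation of standard deviations and correlations from the final second-derivative approximation can be sketched in Python. This is an illustrative sketch, assuming the residual variance is the final sum of squares divided by the residual degrees of freedom and the covariance matrix is that variance times the inverse of the matrix. The names `invert` and `estimate_statistics` are hypothetical.

```python
import math


def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        d = M[k][k]
        M[k] = [v / d for v in M[k]]
        for r in range(n):
            if r != k:
                f = M[r][k]
                M[r] = [vr - f * vk for vr, vk in zip(M[r], M[k])]
    return [row[n:] for row in M]


def estimate_statistics(S_min, H, N, n_arma, kfc):
    """Return residual variance, degrees of freedom, standard deviations, correlations.

    n_arma is the number of ARMA parameters; kfc is 1 if the constant was estimated.
    """
    df = N - n_arma - kfc
    erv = S_min / df
    cov = [[erv * v for v in row] for row in invert(H)]
    m = len(H)
    sd = [math.sqrt(cov[i][i]) for i in range(m)]
    corr = [[cov[i][j] / (sd[i] * sd[j]) for j in range(m)] for i in range(m)]
    return erv, df, sd, corr


erv, df, sd, corr = estimate_statistics(0.107, [[5.0, 10.0], [10.0, 30.0]], 5, 2, 0)
print(df, sd, corr[0][1])
```

As the text notes, these are asymptotic approximations, so they are only as reliable as the convergence of the iterations.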
The differenced series $w_t$ (now uncorrected for the constant), intermediate series $e_t$ and residual series $a_t$ are all available upon completion of the iterations over the range (extended by backforecasts)
$$t = 1-q', 2-q', \ldots, N.$$
The values $a_t$ can only properly be interpreted as residuals for $t \ge 1 + p' - q'$, as the earlier values are corrupted by transients if $p' > 0$.
In consequence of the manner in which differencing is implemented, the residual $a_t$ is the one step ahead forecast error for $x_{t+d'}$.
For convenient application in forecasting, the following quantities constitute the ‘state set’, which contains the minimum amount of time series information needed to construct forecasts:
(i) the differenced series $w_t$, for $(N - s \times P) < t \le N$,
(ii) the $d'$ values required to reconstitute the original series $x_t$ from the differenced series $w_t$,
(iii) the intermediate series $e_t$, for $(N - \max(p, Q \times s)) < t \le N$,
(iv) the residual series $a_t$, for $(N - q) < t \le N$.
This state set is available upon completion of the iterations. The routine may be used purely for the construction of this state set, given a previously estimated model and time series $x_t$, by requesting zero iterations. Backforecasts are estimated, but the model parameter values are unchanged. If later observations become available and it is desired to update the state set, G13AGF can be used.
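The size of the state set follows from its four components. The Python sketch below adds them up, assuming the components hold s × P differenced values, d + D × s reconstitution values, max(p, Q × s) intermediate values and q residuals. The function name `state_set_size` is hypothetical.

```python
def state_set_size(mr):
    """Number of values in the state set for an orders vector (p, d, q, P, D, Q, s)."""
    p, d, q, P, D, Q, s = mr
    n_differenced = P * s            # (i)   recent differenced values
    n_reconstitution = d + D * s     # (ii)  values needed to undo the differencing
    n_intermediate = max(p, Q * s)   # (iii) recent intermediate values
    n_residual = q                   # (iv)  recent residuals
    return n_differenced + n_reconstitution + n_intermediate + n_residual


print(state_set_size((1, 1, 2, 0, 0, 0, 0)))     # non-seasonal model
print(state_set_size((0, 1, 1, 0, 1, 1, 12)))    # seasonal model with period 12
```

A helper of this kind can be used to size the state set array before the call.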
4 References
Box G E P and Jenkins G M (1976) Time Series Analysis: Forecasting and Control (Revised Edition) Holden–Day
Marquardt D W (1963) An algorithm for least-squares estimation of nonlinear parameters J. Soc. Indust. Appl. Math. 11 431
5 Parameters
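- 1: MR(7) – INTEGER array, Input
On entry: the orders vector $(p, d, q, P, D, Q, s)$ of the ARIMA model whose parameters are to be estimated. MR(1), MR(3), MR(4) and MR(6) refer respectively to the number of autoregressive ($\phi$), moving average ($\theta$), seasonal autoregressive ($\Phi$) and seasonal moving average ($\Theta$) parameters. MR(2), MR(5) and MR(7) refer respectively to the order of non-seasonal differencing, the order of seasonal differencing and the seasonal period.
Constraints:
- $p, d, q, P, D, Q, s \ge 0$;
- $p + q + P + Q > 0$;
- $s \ne 1$;
- if $s = 0$, $P + D + Q = 0$;
- if $s > 1$, $P + D + Q > 0$;
- $d + s \times (P + D) \le n$;
- $p + d - q + s \times (P + D - Q) \le n$.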
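Checking an orders vector before the call can be sketched in Python. The constraint forms used here (non-negative orders, at least one ARMA parameter, a seasonal period other than 1, seasonal orders present exactly when a seasonal period is given, and two length limits) are assumptions for the example and should be compared with the constraint list above. The function name `check_orders` is hypothetical.

```python
def check_orders(mr, n):
    """Return descriptions of violated constraints for mr = (p, d, q, P, D, Q, s)."""
    p, d, q, P, D, Q, s = mr
    problems = []
    if min(mr) < 0:
        problems.append("all orders must be non-negative")
    if p + q + P + Q <= 0:
        problems.append("at least one ARMA parameter is required")
    if s == 1:
        problems.append("the seasonal period must not be 1")
    if s == 0 and P + D + Q != 0:
        problems.append("seasonal orders must be zero when s = 0")
    if s > 1 and P + D + Q == 0:
        problems.append("a seasonal period needs at least one seasonal order")
    if d + s * (P + D) > n:
        problems.append("series too short for the differencing and seasonal AR orders")
    if p + d - q + s * (P + D - Q) > n:
        problems.append("series too short for the generalized orders")
    return problems


print(check_orders((1, 1, 2, 0, 0, 0, 0), 30))
print(check_orders((1, 1, 2, 0, 0, 0, 1), 30))
```

An empty list means no violation was found by this sketch.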
- 2: PAR(NPAR) – REAL (KIND=nag_wp) array, Input/Output
On entry: the initial estimates of the $p$ values of the $\phi$ parameters, the $q$ values of the $\theta$ parameters, the $P$ values of the $\Phi$ parameters and the $Q$ values of the $\Theta$ parameters, in that order.
On exit: the latest values of the estimates of these parameters.
- 3: NPAR – INTEGER, Input
On entry: the total number of $\phi$, $\theta$, $\Phi$ and $\Theta$ parameters to be estimated.
Constraint: $\mathrm{NPAR} = p + q + P + Q$.
- 4: C – REAL (KIND=nag_wp), Input/Output
On entry: if $\mathrm{KFC} = 0$, C must contain the expected value, $c$, of the differenced series.
If $\mathrm{KFC} = 1$, C must contain an initial estimate of $c$.
On exit: if $\mathrm{KFC} = 0$, C is unchanged.
If $\mathrm{KFC} = 1$, C contains the latest estimate of $c$.
Therefore, if C and KFC are both zero on entry, there is no constant correction.
- 5: KFC – INTEGER, Input
On entry: must be set to $1$ if the constant, $c$, is to be estimated and $0$ if it is to be held fixed at its initial value.
Constraint: $\mathrm{KFC} = 0$ or $1$.
- 6: X(NX) – REAL (KIND=nag_wp) array, Input
On entry: the $n$ values of the original undifferenced time series.
- 7: NX – INTEGER, Input
On entry: $n$, the length of the original undifferenced time series.
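- 8: ICOUNT(6) – INTEGER array, Output
On exit: size of various output arrays.
- ICOUNT(1) contains $q + (Q \times s)$, the number of backforecasts.
- ICOUNT(2) contains $n - d - (D \times s)$, the number of differenced values.
- ICOUNT(3) contains $d + (D \times s)$, the number of values of reconstitution information.
- ICOUNT(4) contains $n + q + (Q \times s)$, the number of values held in each of the series EX, EXR and AL.
- ICOUNT(5) contains $n - d - (D \times s) - p - q - P - Q - \mathrm{KFC}$, the number of degrees of freedom associated with $S$.
- ICOUNT(6) contains $\mathrm{ICOUNT}(1) + \mathrm{NPAR} + \mathrm{KFC}$, the number of parameters being estimated.
These values are always computed regardless of the exit value of IFAIL.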
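The six counts can be worked out before the call in order to size the arrays. The Python sketch below assumes the counts are, in order: backforecasts, differenced values, reconstitution values, extended series length, residual degrees of freedom and number of estimated parameters (backforecasts included). The function name `icount` is hypothetical and simply mirrors the name of the output array.

```python
def icount(mr, n, npar, kfc):
    """Return the six size counts for orders (p, d, q, P, D, Q, s) and series length n."""
    p, d, q, P, D, Q, s = mr
    n_back = q + Q * s               # backforecasts
    n_diff = n - d - D * s           # differenced values
    n_recon = d + D * s              # reconstitution values
    return [
        n_back,
        n_diff,
        n_recon,
        n_back + n_diff + n_recon,   # length of the extended series
        n_diff - npar - kfc,         # residual degrees of freedom
        n_back + npar + kfc,         # parameters being estimated
    ]


# One AR and two MA parameters, first differences, constant estimated, 30 observations
print(icount((1, 1, 2, 0, 0, 0, 0), n=30, npar=3, kfc=1))
```

The fourth and sixth values are the ones that the dimension arguments for the series arrays and the derivative arrays are compared against.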
- 9: EX(IEX) – REAL (KIND=nag_wp) array, Output
On exit: the extended differenced series, which is made up of:
ICOUNT(1) backforecast values of the differenced series.
ICOUNT(2) actual values of the differenced series.
ICOUNT(3) values of reconstitution information.
The total number of these values held in EX is ICOUNT(4).
If the routine exits because of a faulty input parameter, the contents of EX will be indeterminate.
- 10: EXR(IEX) – REAL (KIND=nag_wp) array, Output
On exit: the values of the model residuals, which are made up of:
ICOUNT(1) residuals corresponding to the backforecasts in the differenced series.
ICOUNT(2) residuals corresponding to the actual values in the differenced series.
The remaining ICOUNT(3) values contain zeros.
If the routine exits with
IFAIL holding a value other than
or
, the contents of
EXR will be indeterminate.
- 11: AL(IEX) – REAL (KIND=nag_wp) array, Output
On exit: the intermediate series, which is made up of:
ICOUNT(1) intermediate series values corresponding to the backforecasts in the differenced series.
ICOUNT(2) intermediate series values corresponding to the actual values in the differenced series.
The remaining ICOUNT(3) values contain zeros.
If the routine exits with
, the contents of
AL will be indeterminate.
- 12: IEX – INTEGER, Input
On entry: the dimension of the arrays EX, EXR and AL as declared in the (sub)program from which G13AEF is called.
Constraint: $\mathrm{IEX} \ge q + (Q \times s) + n$, which is equivalent to the exit value of ICOUNT(4).
- 13: S – REAL (KIND=nag_wp), Output
On exit: the residual sum of squares after the latest series of parameter estimates has been incorporated into the model. If the routine exits with a faulty input parameter, S contains zero.
- 14: G(IGH) – REAL (KIND=nag_wp) array, Output
On exit: the latest value of the derivatives of $S$ with respect to each of the parameters being estimated (backforecasts, PAR parameters, and where relevant the constant, in that order). The contents of G will be indeterminate if the routine exits with a faulty input parameter.
- 15: IGH – INTEGER, Input
On entry: the dimension of the arrays G and SD and the second dimension of the arrays H and HC as declared in the (sub)program from which G13AEF is called.
Constraint: $\mathrm{IGH} \ge q + (Q \times s) + \mathrm{NPAR} + \mathrm{KFC}$, which is equivalent to the exit value of ICOUNT(6).
- 16: SD(IGH) – REAL (KIND=nag_wp) array, Output
On exit: the standard deviations corresponding to each of the parameters being estimated (backforecasts, PAR parameters, and where relevant the constant, in that order).
If the routine exits with
IFAIL containing a value other than
or
, or if the required number of iterations is zero, the contents of
SD will be indeterminate.
- 17: H(LDH,IGH) – REAL (KIND=nag_wp) array, Output
On exit: the second derivative of $S$ and correlation coefficients:
(a) the latest values of an approximation to the second derivative of $S$ with respect to each of the parameters being estimated (backforecasts, PAR parameters, and where relevant the constant, in that order), and
(b) the correlation coefficients relating to each pair of these parameters.
These are held in a matrix defined by the first $q + (Q \times s) + \mathrm{NPAR} + \mathrm{KFC}$ rows and the first $q + (Q \times s) + \mathrm{NPAR} + \mathrm{KFC}$ columns of H. (Note that ICOUNT(6) contains the value of this expression.) The values of (a) are contained in the upper triangle, and the values of (b) in the strictly lower triangle.
These correlation coefficients are zero during intermediate printout using
PIV, and indeterminate if
IFAIL contains on exit a value other than
or
.
All the contents of H are indeterminate if the required number of iterations is zero. The $(\mathrm{ICOUNT}(6) + 1)$th row of H is used internally as workspace.
- 18: LDH – INTEGER, Input
On entry: the first dimension of the arrays H and HC as declared in the (sub)program from which G13AEF is called.
Constraint: $\mathrm{LDH} > q + (Q \times s) + \mathrm{NPAR} + \mathrm{KFC}$, the right-hand side of which is equivalent to the exit value of ICOUNT(6).
- 19: ST(IST) – REAL (KIND=nag_wp) array, Output
On exit: the
NST values of the state set array. If the routine exits with
IFAIL containing a value other than
or
, the contents of
ST will be indeterminate.
- 20: IST – INTEGER, Input
On entry: the dimension of the array ST as declared in the (sub)program from which G13AEF is called.
Constraint: $\mathrm{IST} \ge (P \times s) + d + (D \times s) + q + \max(p, Q \times s)$.
- 21: NST – INTEGER, Output
On exit: the number of values in the state set array ST.
- 22: PIV – SUBROUTINE, supplied by the NAG Library or the user. External Procedure
PIV is used to monitor the progress of the optimization.
The specification of PIV is:
SUBROUTINE PIV (MR, PAR, NPAR, C, KFC, ICOUNT, S, G, H, LDH, IGH, ITC, ZSP)
INTEGER             MR(7), NPAR, KFC, ICOUNT(6), LDH, IGH, ITC
REAL (KIND=nag_wp)  PAR(NPAR), C, S, G(IGH), H(LDH,IGH), ZSP(4)
PIV is called on each iteration by G13AEF when the input value of KPIV is nonzero and is bypassed when it is $0$.
The routine G13AFZ may be used as PIV. It prints the heading
G13AFZ MONITORING OUTPUT - ITERATION n
followed by the parameter values and the residual sum of squares. Output is directed to the advisory channel defined by X04ABF.
- 1: MR(7) – INTEGER array, Input
- 2: PAR(NPAR) – REAL (KIND=nag_wp) array, Input
- 3: NPAR – INTEGER, Input
- 4: C – REAL (KIND=nag_wp), Input
- 5: KFC – INTEGER, Input
- 6: ICOUNT(6) – INTEGER array, Input
- 7: S – REAL (KIND=nag_wp), Input
- 8: G(IGH) – REAL (KIND=nag_wp) array, Input
- 9: H(LDH,IGH) – REAL (KIND=nag_wp) array, Input
- 10: LDH – INTEGER, Input
- 11: IGH – INTEGER, Input
- 12: ITC – INTEGER, Input
- 13: ZSP(4) – REAL (KIND=nag_wp) array, Input
On entry: all the parameters are defined as for G13AEF itself.
PIV must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which G13AEF is called. Parameters denoted as Input must not be changed by this procedure.
If $\mathrm{KPIV} = 0$ a dummy PIV must be supplied.
- 23: KPIV – INTEGER, Input
On entry: must be nonzero if the progress of the optimization is to be monitored using PIV. Otherwise KPIV must contain $0$.
- 24: NIT – INTEGER, Input
On entry: the maximum number of iterations to be performed.
Constraint: $\mathrm{NIT} \ge 0$.
- 25: ITC – INTEGER, Output
On exit: the number of iterations performed.
- 26: ZSP(4) – REAL (KIND=nag_wp) array, Input/Output
On entry: when $\mathrm{KZSP} = 1$, the first four elements of ZSP must contain the four values used to guide the search procedure. These are as follows.
ZSP(1) contains $\alpha$, the value used to constrain the magnitude of the search procedure steps.
ZSP(2) contains $\beta$, the multiplier which regulates the value $\alpha$.
ZSP(3) contains $\delta$, the value of the stationarity and invertibility test tolerance factor.
ZSP(4) contains $\gamma$, the value of the convergence criterion.
If $\mathrm{KZSP} \ne 1$ on entry, default values for ZSP are supplied by the routine.
These are $0.001$, $10.0$, $1000.0$ and $\max(100 \times \text{machine precision}, 0.0000001)$ respectively.
On exit: ZSP contains the values, default or otherwise, used by the routine.
Constraint: if $\mathrm{KZSP} = 1$, $\mathrm{ZSP}(1) > 0.0$, $\mathrm{ZSP}(2) > 1.0$, $\mathrm{ZSP}(3) \ge 1.0$, $0 \le \mathrm{ZSP}(4) < 1.0$.
- 27: KZSP – INTEGER, Input
On entry: the value $1$ if the routine is to use the input values of ZSP, and any other value if the default values of ZSP are to be used.
- 28: ISF(4) – INTEGER array, Output
On exit: contains success/failure indicators, one for each of the four types of parameter in the model (autoregressive, moving average, seasonal autoregressive, seasonal moving average), in that order.
Each indicator has the interpretation:
$-1$: On entry parameters of this type have initial estimates which do not satisfy the stationarity or invertibility test conditions.
$-2$: The search procedure has failed to converge because the latest set of parameter estimates of this type is invalid.
$0$: No parameter of this type is in the model.
$1$: Valid final estimates for parameters of this type have been obtained.
- 29: WA(IWA) – REAL (KIND=nag_wp) array, Workspace
- 30: IWA – INTEGER, Input
On entry: the dimension of the array WA as declared in the (sub)program from which G13AEF is called.
Constraint:
.
Where |
; |
and |
if ; |
|
if , ; |
|
if , , ; |
|
if , , , ; |
|
otherwise. |
- 31: HC(LDH,IGH) – REAL (KIND=nag_wp) array, Workspace
- 32: IFAIL – INTEGER, Input/Output
On entry: IFAIL must be set to $0$, $-1$ or $1$. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, because for this routine the values of the output parameters may be useful even if $\mathrm{IFAIL} \ne 0$ on exit, the recommended value is $-1$.
When the value $-1$ or $1$ is used it is essential to test the value of IFAIL on exit.
On exit: $\mathrm{IFAIL} = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).
6 Error Indicators and Warnings
If on entry $\mathrm{IFAIL} = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Note: G13AEF may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the routine:
IFAIL = 1
On entry, $\mathrm{NPAR} \ne p + q + P + Q$, or the orders vector MR is invalid (check it against the constraints in Section 5), or $\mathrm{KFC} \ne 0$ or $1$.
IFAIL = 2
On entry, $\mathrm{NX} - d - D \times s \le \mathrm{NPAR} + \mathrm{KFC}$, i.e., the number of terms in the differenced series is not greater than the number of parameters in the model. The model is over-parameterised.
IFAIL = 3
On entry, one or more of the user-supplied criteria for controlling the iterative process are invalid:
$\mathrm{NIT} < 0$,
or if $\mathrm{KZSP} = 1$, $\mathrm{ZSP}(1) \le 0.0$;
or if $\mathrm{KZSP} = 1$, $\mathrm{ZSP}(2) \le 1.0$;
or if $\mathrm{KZSP} = 1$, $\mathrm{ZSP}(3) < 1.0$;
or if $\mathrm{KZSP} = 1$, $\mathrm{ZSP}(4) < 0.0$;
or if $\mathrm{KZSP} = 1$, $\mathrm{ZSP}(4) \ge 1.0$.
IFAIL = 4
On entry, the state set array ST is too small. The output value of NST contains the required value (see the description of IST in Section 5 for the formula).
IFAIL = 5
On entry, the workspace array WA is too small. Check the value of IWA against the constraints in Section 5.
IFAIL = 6
On entry, $\mathrm{IEX} < q + (Q \times s) + \mathrm{NX}$,
or $\mathrm{IGH} < q + (Q \times s) + \mathrm{NPAR} + \mathrm{KFC}$,
or $\mathrm{LDH} \le q + (Q \times s) + \mathrm{NPAR} + \mathrm{KFC}$.
IFAIL = 7
This indicates a failure in the search procedure, with $\mathrm{ZSP}(1) \ge 10^9$.
Some output parameters may contain meaningful values; see Section 5 for details.
IFAIL = 8
This indicates a failure to invert $H$.
Some output parameters may contain meaningful values; see Section 5 for details.
IFAIL = 9
This indicates a failure in F04ASF which is used to solve the equations giving the latest estimates of the backforecasts.
Some output parameters may contain meaningful values; see Section 5 for details.
IFAIL = 10
Satisfactory parameter estimates could not be obtained for all parameter types in the model. Inspect array ISF for further information on the parameter type(s) in error.
7 Accuracy
The computations are believed to be stable.
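8 Further Comments
The time taken by G13AEF is approximately proportional to $\mathrm{NX} \times \mathrm{ITC} \times (q + Q \times s + \mathrm{NPAR} + \mathrm{KFC})^2$.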
9 Example
The following program reads observations from a time series relating to the rate of the earth's rotation about its polar axis. Differencing of order is applied, and the number of non-seasonal parameters is , one autoregressive , and two moving average . No seasonal effects are taken into account.
The constant is estimated. Up to iterations are allowed.
The initial estimates of $\phi_1$, $\theta_1$, $\theta_2$ and $c$ are zero.
9.1 Program Text
Program Text (g13aefe.f90)
9.2 Program Data
Program Data (g13aefe.d)
9.3 Program Results
Program Results (g13aefe.r)