PDF version (NAG web site
, 64-bit version, 64-bit version)
NAG Toolbox: nag_tsa_uni_arima_estim (g13ae)
Purpose
nag_tsa_uni_arima_estim (g13ae) fits a seasonal autoregressive integrated moving average (ARIMA) model to an observed time series, using a nonlinear least squares procedure incorporating backforecasting. Parameter estimates are obtained, together with appropriate standard errors. The residual series is returned, and information for use in forecasting the time series is produced for use by the functions
nag_tsa_uni_arima_update (g13ag) and
nag_tsa_uni_arima_forecast_state (g13ah).
The estimation procedure is iterative, starting with initial parameter values such as may be obtained using
nag_tsa_uni_arima_prelim (g13ad). It continues until a specified convergence criterion is satisfied, or until a specified number of iterations has been carried out. The progress of the procedure can be monitored by means of a user-supplied function.
Syntax
[
par,
c,
icount,
ex,
exr,
al,
s,
g,
sd,
h,
st,
nst,
itc,
zsp,
isf,
ifail] = g13ae(
mr,
par,
c,
x,
iex,
igh,
ist,
piv,
kpiv,
zsp,
kzsp, 'npar',
npar, 'kfc',
kfc, 'nx',
nx, 'nit',
nit)
[
par,
c,
icount,
ex,
exr,
al,
s,
g,
sd,
h,
st,
nst,
itc,
zsp,
isf,
ifail] = nag_tsa_uni_arima_estim(
mr,
par,
c,
x,
iex,
igh,
ist,
piv,
kpiv,
zsp,
kzsp, 'npar',
npar, 'kfc',
kfc, 'nx',
nx, 'nit',
nit)
Note: the interface to this routine has changed since earlier releases of the toolbox:
At Mark 23: |
nit and kfc were made optional |
Description
The time series
supplied to
nag_tsa_uni_arima_estim (g13ae) is assumed to follow a seasonal autoregressive integrated moving average (ARIMA) model defined as follows:
where
is the result of applying non-seasonal differencing of order
and seasonal differencing of seasonality
and order
to the series
, as outlined in the description of
nag_tsa_uni_diff (g13aa). The differenced series is then of length
, where
is the generalized order of differencing. The scalar
is the expected value of the differenced series, and the series
follows a zero-mean stationary autoregressive moving average (ARMA) model defined by a pair of recurrence equations. These express
in terms of an uncorrelated series
, via an intermediate series
. The first equation describes the seasonal structure:
The second equation describes the non-seasonal structure. If the model is purely non-seasonal the first equation is redundant and
above is equated with
:
Estimates of the model parameters defined by
and (optionally)
are obtained by minimizing a quadratic form in the vector
.
This is
, where
is the covariance matrix of
, and is a function of the model parameters. This matrix is not explicitly evaluated, since
may be expressed as a ‘sum of squares’ function. When moving average parameters
or
are present, so that the generalized moving average order
is positive, backforecasts
are introduced as nuisance parameters. The ‘sum of squares’ function may then be written as
where
is a combined vector of parameters, consisting of the backforecasts followed by the ARMA model parameters.
The terms correspond to the ARMA model residual series , and is the generalized autoregressive order. The terms are only present if autoregressive parameters are in the model, and serve to correct for transient errors introduced at the start of the autoregression.
The equations defining
and
are precisely:
- ,
for . - ,
for . - ,
for - ,
for .
For all four of these equations, the following conditions hold:
- if
- if
- if
- if
- if
Minimization of
with respect to
uses an extension of the algorithm of
Marquardt (1963).
The first derivatives of
with respect to the parameters are calculated as
where
and
are derivatives of
and
with respect to the
th parameter.
The second derivative of
is approximated by
Successive parameter iterates are obtained by calculating a vector of corrections
by solving the equations
where
is a vector with elements
,
is a matrix with elements
,
is a scalar used to control the search and
is the diagonal matrix of
.
The new parameter values are then .
The scalar controls the step size, to which it is inversely related.
If a step results in new parameter values which give a reduced value of , then is reduced by a factor . If a step results in new parameter values which give an increased value of , or in ARMA model parameters which in any way contravene the stationarity and invertibility conditions, then the new parameters are rejected, is increased by the factor , and the revised equations are solved for a new parameter correction.
This action is repeated until either a reduced value of is obtained, or reaches the limit of , which is used to indicate a failure of the search procedure.
This failure may be due to a badly conditioned sum of squares function or to too strict a convergence criterion. Convergence is deemed to have occurred if the fractional reduction in the residual sum of squares in successive iterations is less than a value , while .
The stationarity and invertibility conditions are tested to within a specified tolerance multiple of machine accuracy. Upon convergence, or completion of the specified maximum number of iterations without convergence, statistical properties of the estimates are derived. In the latter case the sequence of iterates should be checked to ensure that convergence is adequate for practical purposes, otherwise these properties are not reliable.
The estimated residual variance is
where
is the final value of
, and the residual number of degrees of freedom is given by
The covariance matrix of the vector of estimates
is given by
where
is evaluated at the final parameter values.
From this expression are derived the vector of standard deviations, and the correlation matrix for the whole parameter set. These are asymptotic approximations.
The differenced series
(now uncorrected for the constant), intermediate series
and residual series
are all available upon completion of the iterations over the range (extended by backforecasts)
The values
can only properly be interpreted as residuals for
, as the earlier values are corrupted by transients if
.
In consequence of the manner in which differencing is implemented, the residual is the one step ahead forecast error for .
For convenient application in forecasting, the following quantities constitute the ‘state set’, which contains the minimum amount of time series information needed to construct forecasts:
(i) |
the differenced series , for , |
(ii) |
the values required to reconstitute the original series from the differenced series , |
(iii) |
the intermediate series , for
, |
(iv) |
the residual series , for . |
This state set is available upon completion of the iterations. The function may be used purely for the construction of this state set, given a previously estimated model and time series
, by requesting zero iterations. Backforecasts are estimated, but the model parameter values are unchanged. If later observations become available and it is desired to update the state set,
nag_tsa_uni_arima_update (g13ag) can be used.
References
Box G E P and Jenkins G M (1976) Time Series Analysis: Forecasting and Control (Revised Edition) Holden–Day
Marquardt D W (1963) An algorithm for least squares estimation of nonlinear parameters J. Soc. Indust. Appl. Math. 11 431
Parameters
Compulsory Input Parameters
- 1:
– int64int32nag_int array
-
The orders vector of the ARIMA model whose parameters are to be estimated. , , and refer respectively to the number of autoregressive (), moving average , seasonal autoregressive () and seasonal moving average () parameters. , and refer respectively to the order of non-seasonal differencing, the order of seasonal differencing and the seasonal period.
Constraints:
- , , , , , , ;
- ;
- ;
- if , ;
- if , ;
- ;
- .
- 2:
– double array
-
The initial estimates of the values of the parameters, the values of the parameters, the values of the parameters and the values of the parameters, in that order.
- 3:
– double scalar
-
If
,
c must contain the expected value,
, of the differenced series.
If
,
c must contain an initial estimate of
.
- 4:
– double array
-
The values of the original undifferenced time series.
- 5:
– int64int32nag_int scalar
-
The dimension of the arrays
ex,
exr and
al.
Constraint:
, which is equivalent to the exit value of .
- 6:
– int64int32nag_int scalar
-
The dimension of the arrays
g and
sd and the second dimension of the arrays
h and
hc.
Constraint:
which is equivalent to the exit value of .
- 7:
– int64int32nag_int scalar
-
The dimension of the array
st.
Constraint:
.
- 8:
– function handle or string containing name of m-file
-
piv is used to monitor the progress of the optimization.
piv(mr, par, npar, c, kfc, icount, s, g, h, ldh, igh, itc, zsp)
Input Parameters
- 1:
– int64int32nag_int array
- 2:
– double array
- 3:
– int64int32nag_int scalar
- 4:
– double scalar
- 5:
– int64int32nag_int scalar
- 6:
– int64int32nag_int array
- 7:
– double scalar
- 8:
– double array
- 9:
– double array
- 10:
– int64int32nag_int scalar
- 11:
– int64int32nag_int scalar
- 12:
– int64int32nag_int scalar
- 13:
– double array
-
All the arguments are defined as for nag_tsa_uni_arima_estim (g13ae) itself.
If
a dummy
piv must be supplied.
- 9:
– int64int32nag_int scalar
-
Must be nonzero if the progress of the optimization is to be monitored using
piv. Otherwise
kpiv must contain
.
- 10:
– double array
-
When
, the first four elements of
zsp must contain the four values used to guide the search procedure. These are as follows.
contains , the value used to constrain the magnitude of the search procedure steps.
contains , the multiplier which regulates the value .
contains , the value of the stationarity and invertibility test tolerance factor.
contains , the value of the convergence criterion.
If
on entry, default values for
zsp are supplied by the function.
These are , , and respectively.
Constraint:
if , , , , .
- 11:
– int64int32nag_int scalar
-
The value
if the function is to use the input values of
zsp, and any other value if the default values of
zsp are to be used.
Optional Input Parameters
- 1:
– int64int32nag_int scalar
-
Default:
the dimension of the array
par.
The total number of , , and parameters to be estimated.
Constraint:
.
- 2:
– int64int32nag_int scalar
Default:
Must be set to if the constant, , is to be estimated and if it is to be held fixed at its initial value.
Constraint:
or .
- 3:
– int64int32nag_int scalar
-
Default:
the dimension of the array
x.
, the length of the original undifferenced time series.
- 4:
– int64int32nag_int scalar
Default:
The maximum number of iterations to be performed.
Constraint:
.
Output Parameters
- 1:
– double array
-
The latest values of the estimates of these parameters.
- 2:
– double scalar
-
If
,
c is unchanged.
If
,
c contains the latest estimate of
.
Therefore, if
c and
kfc are both zero on entry, there is no constant correction.
- 3:
– int64int32nag_int array
-
Size of various output arrays.
- Contains , the number of backforecasts.
- Contains , the number of differenced values.
- Contains , the number of values of reconstitution information.
- Contains , the number of values held in each of the series ex, exr and al.
- Contains , the number of degrees of freedom associated with .
- Contains , the number of parameters being estimated.
These values are always computed regardless of the exit value of
ifail.
- 4:
– double array
-
The extended differenced series which is made up of:
backforecast values of the differenced series.
actual values of the differenced series.
values of reconstitution information.
The total number of these values held in
ex is
.
If the function exits because of a faulty input argument, the contents of
ex will be indeterminate.
- 5:
– double array
-
The values of the model residuals which is made up of:
residuals corresponding to the backforecasts in the differenced series.
residuals corresponding to the actual values in the differenced series.
The remaining values contain zeros.
If the function exits with
ifail holding a value other than
or
, the contents of
exr will be indeterminate.
- 6:
– double array
-
The intermediate series which is made up of:
intermediate series values corresponding to the backforecasts in the differenced series.
intermediate series values corresponding to the actual values in the differenced series.
The remaining values contain zeros.
If the function exits with
, the contents of
al will be indeterminate.
- 7:
– double scalar
-
The residual sum of squares after the latest series of parameter estimates has been incorporated into the model. If the function exits with a faulty input argument,
s contains zero.
- 8:
– double array
-
The latest value of the derivatives of
with respect to each of the parameters being estimated (backforecasts,
par parameters, and where relevant the constant – in that order). The contents of
g will be indeterminate if the function exits with a faulty input argument.
- 9:
– double array
-
The standard deviations corresponding to each of the parameters being estimated (backforecasts,
par parameters, and where relevant the constant, in that order).
If the function exits with
ifail containing a value other than
or
, or if the required number of iterations is zero, the contents of
sd will be indeterminate.
- 10:
– double array
-
The second derivative of
and correlation coefficients.
(a) |
the latest values of an approximation to the second derivative of with respect to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant – in that order), and |
(b) |
the correlation coefficients relating to each pair of these parameters. |
These are held in a matrix defined by the first
rows and the first
columns of
h. (Note that
contains the value of this expression.) The values of
(a) are contained in the upper triangle, and the values of
(b) in the strictly lower triangle.
These correlation coefficients are zero during intermediate printout using
piv, and indeterminate if
ifail contains on exit a value other than
or
.
All the contents of
h are indeterminate if the required number of iterations are zero. The
th row of
h is used internally as workspace.
- 11:
– double array
-
The
nst values of the state set array. If the function exits with
ifail containing a value other than
or
, the contents of
st will be indeterminate.
- 12:
– int64int32nag_int scalar
-
The number of values in the state set array
st.
- 13:
– int64int32nag_int scalar
-
The number of iterations performed.
- 14:
– double array
-
zsp contains the values, default or otherwise, used by the function.
- 15:
– int64int32nag_int array
-
Contains success/failure indicators, one for each of the four types of parameter in the model (autoregressive, moving average, seasonal autoregressive, seasonal moving average), in that order.
Each indicator has the interpretation:
|
On entry parameters of this type have initial estimates which do not satisfy the stationarity or invertibility test conditions. |
|
The search procedure has failed to converge because the latest set of parameter estimates of this type is invalid. |
|
No parameter of this type is in the model. |
|
Valid final estimates for parameters of this type have been obtained. |
- 16:
– int64int32nag_int scalar
unless the function detects an error (see
Error Indicators and Warnings).
Error Indicators and Warnings
Note: nag_tsa_uni_arima_estim (g13ae) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:
Cases prefixed with W are classified as warnings and
do not generate an error of type NAG:error_n. See nag_issue_warnings.
-
-
On entry, | , |
or | the orders vector mr is invalid (check it against the constraints in Arguments), |
or | or . |
-
-
On entry, , i.e., the number of terms in the differenced series is not greater than the number of parameters in the model. The model is over-parameterised.
-
-
On entry, | one or more of the user-supplied criteria for controlling the iterative process are invalid, |
or | , |
or | if , ; |
or | if , ; |
or | if , ; |
or | if , ; |
or | if , . |
-
-
On entry, the state set array
st is too small. The output value of
nst contains the required value (see the description of
ist in
Arguments for the formula).
-
-
On entry, the workspace array
wa is too small. Check the value of
iwa against the constraints in
Arguments.
- W
-
On entry, | , |
or | , |
or | . |
- W
-
This indicates a failure in the search procedure, with .
Some output arguments may contain meaningful values; see
Arguments for details.
- W
-
This indicates a failure to invert .
Some output arguments may contain meaningful values; see
Arguments for details.
- W
-
This indicates a failure in
nag_linsys_real_posdef_solve_1rhs (f04as) which is used to solve the equations giving the latest estimates of the backforecasts.
Some output arguments may contain meaningful values; see
Arguments for details.
- W
-
Satisfactory parameter estimates could not be obtained for all parameter types in the model. Inspect array
isf for further information on the parameter type(s) in error.
-
An unexpected error has been triggered by this routine. Please
contact
NAG.
-
Your licence key may have expired or may not have been installed correctly.
-
Dynamic memory allocation failed.
Accuracy
The computations are believed to be stable.
Further Comments
The time taken by nag_tsa_uni_arima_estim (g13ae) is approximately proportional to .
Example
The following program reads observations from a time series relating to the rate of the earth's rotation about its polar axis. Differencing of order is applied, and the number of non-seasonal parameters is , one autoregressive , and two moving average . No seasonal effects are taken into account.
The constant is estimated. Up to iterations are allowed.
The initial estimates of , , and are zero.
Open in the MATLAB editor:
g13ae_example
function g13ae_example
fprintf('g13ae example results\n\n');
x = [-217; -177; -166; -136; -110; -95; -64; -37; -14; -25;
-51; -62; -73; -88; -113; -120; -83; -33; -19; 21;
17; 44; 44; 78; 88; 122; 126; 114; 85; 64];
nx = numel(x);
mr = [int64(1);1;2;0;0;0;0];
npar = mr(1) + mr(3) + mr(4) + mr(6);
par = zeros(npar,1);
iex = mr(3) + (mr(6)*mr(7)) + nx;
igh = mr(3) + (mr(6)*mr(7)) + npar + 1;
ist = (mr(4)+mr(5))*mr(7) + mr(2) + mr(3) + max(mr(1),(mr(6)*mr(7)));
c = 0;
kpiv = int64(0);
zsp = [0.001; 10; 1000; 0.0001];
kzsp = int64(1);
[par, c, icount, ex, exr, al, s, g, sd, h, st, nst, itc, zsp, isf, ifail] = ...
g13ae( ...
mr, par, c, x, iex, igh, ist, @piv, kpiv, zsp, kzsp);
nex = icount(4);
ndf = icount(5);
ngh = icount(6);
fprintf('Convergence was achieved after %4d cycles\n\n',itc);
fprintf('Final values of par array and the constant c are as follows\n');
fprintf('%10.4f', par, c);
fprintf('\n\nResidual sum of squares is %10.3f with %4d %s\n\n', ...
s, ndf, 'degrees of freedom');
fprintf('The final values of ZSP were\n');
fprintf('%15.4e', zsp);
fprintf('\n\nThe number of parameters estimated was %4d\n', ngh);
fprintf('( backward forecasts, par and c, in that order )\n\n');
fprintf('The corresponding G array holds\n');
fprintf('%9.4f', g);
if itc>0
fprintf('\n\nThe corresponding SD array holds\n');
fprintf('%9.4f', sd);
fprintf('\n\n')
[ifail] = x04ca( ...
'General', ' ', h(1:ngh,:), 'Corresponding H matrix');
fprintf('\nHolds second derivatives in the upper half ');
fprintf('(including the main diagonal)\n');
fprintf('and correlation coefficients in the lower triangle\n');
end
fprintf('\n%s%5d%s\n','EX, EXR, and AL each hold', nex, ...
' values made up of');
fprintf('%5d%s\n', icount(1), ' back forecast(s),');
fprintf('%5d%s\n', icount(2), ' differenced values, and');
fprintf('%5d%s\n\n', icount(3), ' element(s) of reconstituted information');
fprintf(' Ex\n');
for j = 1:5:nex
fprintf('%11.5f', ex(j:min(j+4,nex)));
fprintf('\n');
end
fprintf('\n Exr\n');
for j = 1:5:nex
fprintf('%11.5f', exr(j:min(j+4,nex)));
fprintf('\n');
end
fprintf('\n Al\n');
for j = 1:5:nex
fprintf('%11.5f', al(j:min(j+4,nex)));
fprintf('\n');
end
fprintf('\nThe state set consists of %4d values\n',nst);
fprintf('%11.5f', st);
fprintf('\n');
function [] = piv(mr, par, npar, c, kfc, icount, s, g, h, ih, igh, itc, zsp)
fprintf('Iteration %d residual sum f squares = %16.4', itc, s);
g13ae example results
Convergence was achieved after 16 cycles
Final values of par array and the constant c are as follows
-0.0547 -0.5568 -0.6636 9.9807
Residual sum of squares is 9397.924 with 25 degrees of freedom
The final values of ZSP were
1.0000e-15 1.0000e+01 1.0000e+03 1.0000e-04
The number of parameters estimated was 6
( backward forecasts, par and c, in that order )
The corresponding G array holds
-0.1512 -0.2343 -6.4097 13.5617 -72.6232 -0.1642
The corresponding SD array holds
14.8379 15.1887 0.3507 0.2709 0.1695 7.3893
Corresponding H matrix
1 2 3 4 5
1 1.9416E+00 -6.1794E-01 2.4409E-01 1.7942E+00 -8.3579E-01
2 3.4176E-01 1.9446E+00 -1.6544E-01 -2.5084E-01 1.7952E+00
3 -1.0544E-02 5.5643E-03 9.0416E+03 -9.6825E+03 5.4626E+02
4 -1.2113E-02 5.6011E-03 8.1322E-01 1.7031E+04 -5.6761E+03
5 -2.3216E-03 -1.1495E-03 3.6741E-01 4.7942E-01 1.7028E+04
6 -1.4580E-01 -2.6004E-01 -4.0877E-02 -4.8389E-02 -3.7442E-02
6
1 2.4106E-01
2 8.5926E-01
3 8.1847E-01
4 6.9417E+00
5 6.3308E+00
6 7.4339E+00
Holds second derivatives in the upper half (including the main diagonal)
and correlation coefficients in the lower triangle
EX, EXR, and AL each hold 32 values made up of
2 back forecast(s),
29 differenced values, and
1 element(s) of reconstituted information
Ex
19.52500 5.87533 40.00000 11.00000 30.00000
26.00000 15.00000 31.00000 27.00000 23.00000
-11.00000 -26.00000 -11.00000 -11.00000 -15.00000
-25.00000 -7.00000 37.00000 50.00000 14.00000
40.00000 -4.00000 27.00000 0.00000 34.00000
10.00000 34.00000 4.00000 -12.00000 -29.00000
-21.00000 64.00000
Exr
19.52500 -3.92787 19.57110 -5.62907 10.22209
15.15821 -9.32757 16.42850 15.21154 -5.42106
-27.34437 -18.30612 5.38901 -12.98124 -22.47672
-15.21833 4.49436 33.68668 19.75860 -27.14696
32.24262 -12.27651 1.69412 -1.84650 23.37721
-10.45763 14.33018 -5.70614 -28.64010 -20.45020
-2.72147 0.00000
Al
19.52500 5.87533 30.01926 1.01926 20.01926
16.01926 5.01926 21.01926 17.01926 13.01926
-20.98074 -35.98074 -20.98074 -20.98074 -24.98074
-34.98074 -16.98074 27.01926 40.01926 4.01926
30.01926 -13.98074 17.01926 -9.98074 24.01926
0.01926 24.01926 -5.98074 -21.98074 -38.98074
-30.98074 0.00000
The state set consists of 4 values
64.00000 -30.98074 -20.45020 -2.72147
PDF version (NAG web site
, 64-bit version, 64-bit version)
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015