naginterfaces.library.tsa.uni_arima_estim

naginterfaces.library.tsa.uni_arima_estim(mr, par, c, x, iex, igh, ist, kpiv, zsp, kzsp, kfc=1, piv=None, nit=100, data=None, io_manager=None)

uni_arima_estim fits a seasonal autoregressive integrated moving average (ARIMA) model to an observed time series, using a nonlinear least squares procedure incorporating backforecasting. Parameter estimates are obtained, together with appropriate standard errors. The residual series is returned, and the information needed to forecast the time series is produced for use by the functions uni_arima_update() and uni_arima_forecast_state().

The estimation procedure is iterative, starting with initial parameter values such as may be obtained using uni_arima_prelim(). It continues until a specified convergence criterion is satisfied, or until a specified number of iterations has been carried out. The progress of the procedure can be monitored by means of a user-supplied function.

For full information please refer to the NAG Library document for g13ae

https://support.nag.com/numeric/nl/nagdoc_30.1/flhtml/g13/g13aef.html

Parameters
mr : int, array-like, shape (7)

The orders vector $(p, d, q, P, D, Q, s)$ of the ARIMA model whose parameters are to be estimated. $p$, $q$, $P$ and $Q$ refer respectively to the number of autoregressive ($\phi$), moving average ($\theta$), seasonal autoregressive ($\Phi$) and seasonal moving average ($\Theta$) parameters. $d$, $D$ and $s$ refer respectively to the order of non-seasonal differencing, the order of seasonal differencing and the seasonal period.

par : float, array-like, shape (npar)

The initial estimates of the $p$ values of the $\phi$ parameters, the $q$ values of the $\theta$ parameters, the $P$ values of the $\Phi$ parameters and the $Q$ values of the $\Theta$ parameters, in that order.

c : float

If kfc = 0, c must contain the expected value, $c$, of the differenced series.

If kfc = 1, c must contain an initial estimate of $c$.

x : float, array-like, shape (nx)

The $n$ values of the original undifferenced time series.

iex : int

The dimension of the arrays ex, exr and al.

igh : int

The dimension of the arrays g and sd.

The second dimension of the array h.

ist : int

The dimension of the array st.

kpiv : int

Must be nonzero if the progress of the optimization is to be monitored using piv. Otherwise kpiv must contain 0.

zsp : float, array-like, shape (4)

When kzsp = 1, the first four elements of zsp must contain the four values used to guide the search procedure. These are as follows.

zsp[0] contains $\alpha$, the value used to constrain the magnitude of the search procedure steps.

zsp[1] contains $\beta$, the multiplier which regulates the value $\alpha$.

zsp[2] contains $\delta$, the value of the stationarity and invertibility test tolerance factor.

zsp[3] contains $\gamma$, the value of the convergence criterion.

If kzsp ≠ 1 on entry, default values for zsp are supplied by the function.

These are 0.001, 10.0, 1000.0 and max(100 × machine precision, 0.0000001) respectively.

kzsp : int

The value 1 if the function is to use the input values of zsp, and any other value if the default values of zsp are to be used.

kfc : int, optional

Must be set to 1 if the constant, $c$, is to be estimated and 0 if it is to be held fixed at its initial value.

piv : None or callable piv(mr, par, c, kfc, icount, s, g, h, itc, zsp, data=None), optional

Note: if this argument is None then a NAG-supplied facility will be used.

piv is used to monitor the progress of the optimization.

Parameters
mr : int, ndarray, shape (7)

Defined as for uni_arima_estim.

par : float, ndarray, shape (npar)

Defined as for uni_arima_estim.

c : float

Defined as for uni_arima_estim.

kfc : int

Defined as for uni_arima_estim.

icount : int, ndarray, shape (6)

Defined as for uni_arima_estim.

s : float

Defined as for uni_arima_estim.

g : float, ndarray, shape (igh)

Defined as for uni_arima_estim.

h : float, ndarray, shape (igh, igh)

Defined as for uni_arima_estim.

itc : int

Defined as for uni_arima_estim.

zsp : float, ndarray, shape (4)

Defined as for uni_arima_estim.

data : arbitrary, optional, modifiable in place

User-communication data for callback functions.

nit : int, optional

The maximum number of iterations to be performed.

data : arbitrary, optional

User-communication data for callback functions.

io_manager : FileObjManager, optional

Manager for I/O in this routine.

Returns
par : float, ndarray, shape (npar)

The latest values of the estimates of these parameters.

c : float

If kfc = 0, c is unchanged.

If kfc = 1, c contains the latest estimate of $c$.

Therefore, if c and kfc are both zero on entry, there is no constant correction.

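icount : int, ndarray, shape (6)

Size of various output arrays.

icount[0] contains $q + (Q \times s)$, the number of backforecasts.

icount[1] contains $n - d - (D \times s)$, the number of differenced values.

icount[2] contains $d + (D \times s)$, the number of values of reconstitution information.

icount[3] contains $n + q + (Q \times s)$, the number of values held in each of the series ex, exr and al.

icount[4] contains $n - d - (D \times s) - p - q - P - Q - \mathrm{kfc}$, the number of degrees of freedom associated with $S$.

icount[5] contains icount[0] + npar + kfc, the number of parameters being estimated.

These values are always computed regardless of the exit value of errno.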

ex : float, ndarray, shape (iex)

The extended differenced series, which is made up of:

icount[0] backforecast values of the differenced series.

icount[1] actual values of the differenced series.

icount[2] values of reconstitution information.

The total number of these values held in ex is icount[3].

If the function exits because of a faulty input parameter, the contents of ex will be indeterminate.

exr : float, ndarray, shape (iex)

The values of the model residuals, which are made up of:

icount[0] residuals corresponding to the backforecasts in the differenced series.

icount[1] residuals corresponding to the actual values in the differenced series.

The remaining icount[2] values contain zeros.

If the function exits with errno holding a value other than 0 or 9, the contents of exr will be indeterminate.

al : float, ndarray, shape (iex)

The intermediate series, which is made up of:

icount[0] intermediate series values corresponding to the backforecasts in the differenced series.

icount[1] intermediate series values corresponding to the actual values in the differenced series.

The remaining icount[2] values contain zeros.

If an exception is raised, the contents of al will be indeterminate.

s : float

The residual sum of squares after the latest series of parameter estimates has been incorporated into the model. If the function exits with a faulty input parameter, s contains zero.

g : float, ndarray, shape (igh)

The latest value of the derivatives of $S$ with respect to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant, in that order). The contents of g will be indeterminate if the function exits with a faulty input parameter.

sd : float, ndarray, shape (igh)

The standard deviations corresponding to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant, in that order).

If the function exits with errno containing a value other than 0 or 9, or if the required number of iterations is zero, the contents of sd will be indeterminate.

h : float, ndarray, shape (igh, igh)

The second derivative of $S$ and correlation coefficients:

  1. the latest values of an approximation to the second derivative of $S$ with respect to each of the parameters being estimated (backforecasts, par parameters, and where relevant the constant, in that order), and

  2. the correlation coefficients relating to each pair of these parameters.

These are held in a matrix defined by the first $q + (Q \times s) + \mathrm{npar} + \mathrm{kfc}$ rows and the first $q + (Q \times s) + \mathrm{npar} + \mathrm{kfc}$ columns of h. (Note that icount[5] contains the value of this expression.) The values of item 1 are contained in the upper triangle, and the values of item 2 in the strictly lower triangle.

These correlation coefficients are zero during intermediate printout using piv, and indeterminate if errno contains on exit a value other than 0 or 9.

All the contents of h are indeterminate if the required number of iterations is zero.

The (icount[5] + 1)th row of h is used internally as workspace.

st : float, ndarray, shape (ist)

The nst values of the state set array. If the function exits with errno containing a value other than 0 or 9, the contents of st will be indeterminate.

nst : int

The number of values in the state set array st.

itc : int

The number of iterations performed.

zsp : float, ndarray, shape (4)

zsp contains the values, default or otherwise, used by the function.

isf : int, ndarray, shape (4)

Contains success/failure indicators, one for each of the four types of parameter in the model (autoregressive, moving average, seasonal autoregressive, seasonal moving average), in that order.

Each indicator has the interpretation:

−2: On entry, parameters of this type have initial estimates which do not satisfy the stationarity or invertibility test conditions.

−1: The search procedure has failed to converge because the latest set of parameter estimates of this type is invalid.

0: No parameter of this type is in the model.

1: Valid final estimates for parameters of this type have been obtained.

Raises
NagValueError
(errno )

On entry, .

Constraint: or .

(errno )

The orders vector is invalid.

(errno )

On entry, .

Constraint: .

(errno )

The model is over-parameterised.

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, and the minimum size .

Constraint: .

Warns
NagAlgorithmicWarning
(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

A failure in the search procedure has occurred.

(errno )

Failure to invert $H$.

(errno )

Unable to calculate the latest estimates of the backforecasts.

(errno )

Satisfactory parameter estimates could not be obtained for all parameter types in the model.

Notes

No equivalent traditional C interface for this routine exists in the NAG Library.

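The time series $x_1, x_2, \ldots, x_n$ supplied to uni_arima_estim is assumed to follow a seasonal autoregressive integrated moving average (ARIMA) model defined as follows:

$$\nabla^d \nabla_s^D x_t - c = w_t,$$

where $\nabla^d \nabla_s^D x_t$ is the result of applying non-seasonal differencing of order $d$ and seasonal differencing of seasonality $s$ and order $D$ to the series $x_t$, as outlined in the description of uni_diff(). The differenced series is then of length $N = n - d'$, where $d' = d + (D \times s)$ is the generalized order of differencing. The scalar $c$ is the expected value of the differenced series, and the series $w_1, w_2, \ldots, w_N$ follows a zero-mean stationary autoregressive moving average (ARMA) model defined by a pair of recurrence equations. These express $w_t$ in terms of an uncorrelated series $a_t$, via an intermediate series $e_t$. The first equation describes the seasonal structure:

$$w_t = \Phi_1 w_{t-s} + \Phi_2 w_{t-2s} + \cdots + \Phi_P w_{t-Ps} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2s} - \cdots - \Theta_Q e_{t-Qs}.$$

The second equation describes the non-seasonal structure. If the model is purely non-seasonal the first equation is redundant and $e_t$ above is equated with $w_t$:

$$e_t = \phi_1 e_{t-1} + \phi_2 e_{t-2} + \cdots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}.$$

Estimates of the model parameters defined by

$$\phi_1, \phi_2, \ldots, \phi_p, \; \theta_1, \theta_2, \ldots, \theta_q, \; \Phi_1, \Phi_2, \ldots, \Phi_P, \; \Theta_1, \Theta_2, \ldots, \Theta_Q$$

and (optionally) $c$ are obtained by minimizing a quadratic form in the vector $w = (w_1, w_2, \ldots, w_N)^T$.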

This is $QF = w^T V^{-1} w$, where $V$ is the covariance matrix of $w$, and is a function of the model parameters. This matrix is not explicitly evaluated, since $QF$ may be expressed as a ‘sum of squares’ function. When moving average parameters $\theta_i$ or $\Theta_i$ are present, so that the generalized moving average order $q' = q + s \times Q$ is positive, backforecasts $w_{1-q'}, w_{2-q'}, \ldots, w_0$ are introduced as nuisance parameters. The ‘sum of squares’ function may then be written as

$$S(pm) = \sum_{t=1-q'}^{N} a_t^2 - \sum_{t=1-q'-p'}^{-q'} b_t^2,$$

where $pm$ is a combined vector of parameters, consisting of the backforecasts followed by the ARMA model parameters.

The terms $a_t$ correspond to the ARMA model residual series $a_t$, and $p' = p + s \times P$ is the generalized autoregressive order. The terms $b_t$ are only present if autoregressive parameters are in the model, and serve to correct for transient errors introduced at the start of the autoregression.

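The equations defining $a_t$ and $b_t$ are precisely:

$e_t = w_t - \Phi_1 w_{t-s} - \Phi_2 w_{t-2s} - \cdots - \Phi_P w_{t-Ps} + \Theta_1 e_{t-s} + \Theta_2 e_{t-2s} + \cdots + \Theta_Q e_{t-Qs}$, for $t = 1-q', 2-q', \ldots, N$.

$a_t = e_t - \phi_1 e_{t-1} - \phi_2 e_{t-2} - \cdots - \phi_p e_{t-p} + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + \theta_q a_{t-q}$, for $t = 1-q', 2-q', \ldots, N$.

$f_t = w_t - \Phi_1 w_{t+s} - \Phi_2 w_{t+2s} - \cdots - \Phi_P w_{t+Ps} + \Theta_1 f_{t-s} + \Theta_2 f_{t-2s} + \cdots + \Theta_Q f_{t-Qs}$, for $t = (1-q'-s \times P), (2-q'-s \times P), \ldots, (-q'+P)$

$b_t = f_t - \phi_1 f_{t+1} - \phi_2 f_{t+2} - \cdots - \phi_p f_{t+p} + \theta_1 b_{t-1} + \theta_2 b_{t-2} + \cdots + \theta_q b_{t-q}$, for $t = (1-q'-p'), (2-q'-p'), \ldots, (-q'+p)$.

For all four of these equations, the following conditions hold:

$w_i = 0$ if $i < 1-q'$

$e_i = 0$ if $i < 1-q'$

$a_i = 0$ if $i < 1-q'$

$f_i = 0$ if $i < 1-q'-s \times P$

$b_i = 0$ if $i < 1-q'-p'$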
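The forward recursions for the intermediate series and the residual series can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: it ignores backforecasts and the transient-correction terms altogether, and simply treats every value before the start of the differenced series as zero. The helper names `difference` and `arma_residuals` are hypothetical and exist only for this illustration.

```python
def difference(x, d, D, s):
    """Apply non-seasonal differencing d times, then seasonal differencing
    with period s a further D times (illustrative helper, not a NAG call)."""
    w = list(x)
    for _ in range(d):
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    for _ in range(D):
        w = [w[t] - w[t - s] for t in range(s, len(w))]
    return w


def arma_residuals(w, phi, theta, sphi, stheta, s):
    """Run the seasonal recursion (w -> e) and the non-seasonal recursion
    (e -> a), taking all pre-sample values as zero (a simplification)."""
    n = len(w)
    e = [0.0] * n
    a = [0.0] * n

    def at(series, i):
        # Values before the start of the series are treated as zero.
        return series[i] if i >= 0 else 0.0

    for t in range(n):
        e[t] = (w[t]
                - sum(sphi[i] * at(w, t - (i + 1) * s) for i in range(len(sphi)))
                + sum(stheta[i] * at(e, t - (i + 1) * s) for i in range(len(stheta))))
        a[t] = (e[t]
                - sum(phi[i] * at(e, t - (i + 1)) for i in range(len(phi)))
                + sum(theta[i] * at(a, t - (i + 1)) for i in range(len(theta))))
    return e, a


# A short series differenced once, with the constant c held at zero.
c = 0.0
w = [v - c for v in difference([1.0, 3.0, 6.0, 10.0, 15.0, 21.0], d=1, D=0, s=0)]
e, a = arma_residuals(w, phi=[0.5], theta=[], sphi=[], stheta=[], s=0)
sum_of_squares = sum(v * v for v in a)
```

With no seasonal terms the intermediate series coincides with the differenced series, which is the purely non-seasonal case described earlier in these notes.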

Minimization of $S$ with respect to $pm$ uses an extension of the algorithm of Marquardt (1963).

The first derivatives of $S$ with respect to the parameters are calculated as

$$2 \sum a_t \, a_{t,i} - 2 \sum b_t \, b_{t,i} = 2 G_i,$$

where $a_{t,i}$ and $b_{t,i}$ are derivatives of $a_t$ and $b_t$ with respect to the $i$th parameter.

The second derivative of $S$ is approximated by

$$2 \sum a_{t,i} \, a_{t,j} - 2 \sum b_{t,i} \, b_{t,j} = 2 H_{ij}.$$

Successive parameter iterates are obtained by calculating a vector of corrections $dpm$ by solving the equations

$$(H + \alpha D) \, dpm = -G,$$

where $G$ is a vector with elements $G_i$, $H$ is a matrix with elements $H_{ij}$, $\alpha$ is a scalar used to control the search and $D$ is the diagonal matrix of $H$.

The new parameter values are then $pm + dpm$.

The scalar $\alpha$ controls the step size, to which it is inversely related.

If a step results in new parameter values which give a reduced value of $S$, then $\alpha$ is reduced by a factor $\beta$. If a step results in new parameter values which give an increased value of $S$, or in ARMA model parameters which in any way contravene the stationarity and invertibility conditions, then the new parameters are rejected, $\alpha$ is increased by the factor $\beta$, and the revised equations are solved for a new parameter correction.

This action is repeated until either a reduced value of $S$ is obtained, or $\alpha$ reaches the limit of $10^9$, which is used to indicate a failure of the search procedure.

This failure may be due to a badly conditioned sum of squares function or to too strict a convergence criterion. Convergence is deemed to have occurred if the fractional reduction in the residual sum of squares in successive iterations is less than a value $\gamma$, while $\alpha < 1.0$.
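The search logic described above can be sketched for the simplest possible case, a single autoregressive parameter with no backforecasts, where the matrix equation collapses to a scalar one. This is only an illustration of the damping scheme, not the routine's code: the function name `fit_ar1` is hypothetical, and the starting values for the damping scalar, its multiplier, the convergence tolerance and the failure limit are assumptions chosen for the example.

```python
def fit_ar1(w, phi0=0.0, alpha=0.001, beta=10.0, gamma=1e-7, nit=100):
    """Damped (Marquardt-style) least squares for a_t = w_t - phi * w_{t-1},
    with the value before the start of the series taken as zero.
    All default settings here are illustrative assumptions."""

    def lagged(t):
        return w[t - 1] if t > 0 else 0.0

    def sumsq(phi):
        return sum((w[t] - phi * lagged(t)) ** 2 for t in range(len(w)))

    phi, s = phi0, sumsq(phi0)
    for itc in range(1, nit + 1):
        a = [w[t] - phi * lagged(t) for t in range(len(w))]
        da = [-lagged(t) for t in range(len(w))]      # derivative of a_t w.r.t. phi
        g = sum(at * dt for at, dt in zip(a, da))     # scalar analogue of G
        h = sum(dt * dt for dt in da)                 # scalar analogue of H
        while True:
            step = -g / (h + alpha * h)               # (H + alpha*D) dpm = -G, D = diag(H)
            trial = phi + step
            s_trial = sumsq(trial)
            # Accept only a stationary parameter that reduces the sum of squares.
            if abs(trial) < 1.0 and s_trial < s:
                break
            alpha *= beta                             # reject the step, damp harder
            if alpha > 1e9:                           # assumed failure limit
                raise RuntimeError("search procedure failed")
        reduction = (s - s_trial) / s
        phi, s = trial, s_trial
        alpha /= beta                                 # successful step, relax damping
        if reduction < gamma and alpha < 1.0:
            return phi, s, itc
    return phi, s, nit


# A series generated exactly by phi = 0.5, so the fit should recover it.
phi_hat, s_min, iterations = fit_ar1([1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125])
```

In the scalar case the diagonal matrix equals the second-derivative term itself, so each accepted step is the undamped Gauss-Newton step shrunk by the factor 1/(1 + alpha).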

The stationarity and invertibility conditions are tested to within a specified tolerance multiple of machine accuracy. Upon convergence, or completion of the specified maximum number of iterations without convergence, statistical properties of the estimates are derived. In the latter case the sequence of iterates should be checked to ensure that convergence is adequate for practical purposes, otherwise these properties are not reliable.

The estimated residual variance is

$$erv = \frac{S_{\min}}{df},$$

where $S_{\min}$ is the final value of $S$, and the residual number of degrees of freedom is given by

$$df = N - p - q - P - Q \quad (-1 \text{ if } c \text{ is estimated}).$$

The covariance matrix of the vector of estimates $pm$ is given by

$$erv \times H^{-1},$$

where $H$ is evaluated at the final parameter values.

From this expression are derived the vector of standard deviations, and the correlation matrix for the whole parameter set. These are asymptotic approximations.
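As a worked illustration of how the standard deviations and correlations follow from the final sum of squares and the second-derivative matrix, the sketch below handles a two-parameter problem with an explicit 2 by 2 inverse. The function name `estimate_statistics` and the input numbers are invented for the example; the routine itself works with the full parameter set, including any backforecasts.

```python
import math


def estimate_statistics(s_min, h, n_diff, n_model_params):
    """Residual variance, standard deviations and correlation for a
    two-parameter fit (illustrative only; h is a 2x2 nested list)."""
    df = n_diff - n_model_params             # residual degrees of freedom
    erv = s_min / df                         # estimated residual variance
    (h11, h12), (h21, h22) = h
    det = h11 * h22 - h12 * h21
    h_inv = [[h22 / det, -h12 / det],
             [-h21 / det, h11 / det]]
    cov = [[erv * v for v in row] for row in h_inv]   # erv * H^{-1}
    sd = [math.sqrt(cov[i][i]) for i in range(2)]
    corr = cov[0][1] / (sd[0] * sd[1])
    return erv, sd, corr


# Made-up values: 10 differenced observations, 2 model parameters.
erv, sd, corr = estimate_statistics(8.0, [[4.0, 1.0], [1.0, 2.0]], 10, 2)
```

Because these are asymptotic approximations, the numbers are most trustworthy when the iterations have actually converged.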

The differenced series $w_t$ (now uncorrected for the constant), intermediate series $e_t$ and residual series $a_t$ are all available upon completion of the iterations over the range (extended by backforecasts)

$$t = 1-q', 2-q', \ldots, N.$$

The values $a_t$ can only properly be interpreted as residuals for $t \ge 1 + p' - q'$, as the earlier values are corrupted by transients if $p' > 0$.

In consequence of the manner in which differencing is implemented, the residual $a_t$ is the one step ahead forecast error for $x_{t+d'}$.

For convenient application in forecasting, the following quantities constitute the ‘state set’, which contains the minimum amount of time series information needed to construct forecasts:

  1. the differenced series $w_t$, for $(N - s \times P) < t \le N$,

  2. the $d'$ values required to reconstitute the original series $x_t$ from the differenced series $w_t$,

  3. the intermediate series $e_t$, for $N - \max(p, Q \times s) < t \le N$,

  4. the residual series $a_t$, for $(N - q) < t \le N$.

This state set is available upon completion of the iterations. The function may be used purely for the construction of this state set, given a previously estimated model and time series $x_t$, by requesting zero iterations. Backforecasts are estimated, but the model parameter values are unchanged. If later observations become available and it is desired to update the state set, uni_arima_update() can be used.
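Before calling the function, the three array dimensions iex, igh and ist have to be chosen. The sketch below computes sizes from the orders vector using formulas that are assumptions based on the counts described for icount and on the composition of the state set listed above; they should be checked against the g13ae document before being relied on. The helper names `workspace_sizes` and `expected_icount` are hypothetical and not part of naginterfaces.

```python
def workspace_sizes(mr, nx, kfc=1):
    """Candidate values for npar, iex, igh and ist (assumed formulas)."""
    p, d, q, big_p, big_d, big_q, s = mr
    npar = p + q + big_p + big_q
    n_back = q + big_q * s                    # number of backforecasts
    iex = n_back + nx                         # values held in ex, exr and al
    igh = n_back + npar + kfc                 # parameters being estimated
    ist = big_p * s + big_d * s + d + q + max(p, big_q * s)  # state set length
    return npar, iex, igh, ist


def expected_icount(mr, nx, kfc=1):
    """The six counts described for icount, under the same assumptions."""
    p, d, q, big_p, big_d, big_q, s = mr
    npar = p + q + big_p + big_q
    n_back = q + big_q * s
    n_diff = nx - d - big_d * s
    return [n_back, n_diff, d + big_d * s, nx + n_back,
            n_diff - npar - kfc, n_back + npar + kfc]


# A monthly series of 144 values with orders (0, 1, 1, 0, 1, 1, 12), constant held fixed.
mr = (0, 1, 1, 0, 1, 1, 12)
npar, iex, igh, ist = workspace_sizes(mr, nx=144, kfc=0)
counts = expected_icount(mr, nx=144, kfc=0)

# With naginterfaces installed and licensed, the call would then take the form
# given by the signature at the top of this page (not executed here):
#   from naginterfaces.library import tsa
#   tsa.uni_arima_estim(mr, par, c, x, iex, igh, ist, kpiv=0,
#                       zsp=[0.0] * 4, kzsp=0, kfc=0, nit=0)
# where nit=0 requests zero iterations, i.e. state set construction only.
```

Passing a value of kzsp other than 1 leaves the search settings at their defaults, so the contents of zsp on entry are not used in that case.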

References

Box, G E P and Jenkins, G M, 1976, Time Series Analysis: Forecasting and Control, (Revised Edition), Holden–Day

Marquardt, D W, 1963, An algorithm for least squares estimation of nonlinear parameters, J. Soc. Indust. Appl. Math. (11), 431