NAG Library Routine Document

If the root of the characteristic equation for a time series is one then that series is said to have a unit root. Such series are nonstationary. g01ewf is designed to be called after g13awf and returns the probability associated with one of three types of (augmented) Dickey–Fuller test statistic: $τ$, $τ_{μ}$ or $τ_{τ}$, used to test for a unit root, a unit root with drift or a unit root with drift and a deterministic time trend, respectively. The three types of test statistic are constructed as follows:

To test whether a time series, $y_{t}$, for $t = 1, 2, \dots, n$, has a unit root, the regression model

$$\nabla y_{t} = β_{1} y_{t - 1} + \sum_{i = 1}^{p - 1} δ_{i} \nabla y_{t - i} + ε_{t}$$

is fit and the test statistic $τ$ constructed as

$$τ = \frac{{\hat{β}}_{1}}{σ_{11}}$$

where $\nabla$ is the difference operator, with $\nabla y_{t} = y_{t} - y_{t - 1}$, and where ${\hat{β}}_{1}$ and $σ_{11}$ are the least squares estimate and associated standard error for $β_{1}$ respectively.

To test for a unit root with drift, the regression model

$$\nabla y_{t} = β_{1} y_{t - 1} + \sum_{i = 1}^{p - 1} δ_{i} \nabla y_{t - i} + α + ε_{t}$$

is fit and the test statistic $τ_{μ}$ constructed as

$$τ_{μ} = \frac{{\hat{β}}_{1}}{σ_{11}} .$$

To test for a unit root with drift and deterministic time trend, the regression model

$$\nabla y_{t} = β_{1} y_{t - 1} + \sum_{i = 1}^{p - 1} δ_{i} \nabla y_{t - i} + α + β_{2} t + ε_{t}$$

is fit and the test statistic $τ_{τ}$ constructed as

$$τ_{τ} = \frac{{\hat{β}}_{1}}{σ_{11}} .$$

All three test statistics, $τ$, $τ_{μ}$ and $τ_{τ}$, can be calculated using g13awf.
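To illustrate how the three statistics are formed, the following Python sketch fits each regression by ordinary least squares for the simplest case $p = 1$ (no lagged difference terms) and returns ${\hat{β}}_{1} / σ_{11}$. It is not the g13awf implementation: the names `df_statistic` and `_invert` are hypothetical, and the normal equations are solved directly, which is only adequate for short, well-scaled series.

```python
import math


def _invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(a)
    aug = [list(map(float, a[i])) + [1.0 if i == j else 0.0 for j in range(k)]
           for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def df_statistic(y, test_type=1):
    """Non-augmented (p = 1) Dickey-Fuller statistic for a list of floats.

    test_type 1: no constant; 2: constant (drift); 3: constant and time trend.
    """
    if test_type not in (1, 2, 3):
        raise ValueError("test_type must be 1, 2 or 3")
    # Response: first differences. First regressor: the lagged level.
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    rows = []
    for t in range(1, len(y)):
        row = [y[t - 1]]
        if test_type >= 2:
            row.append(1.0)            # drift term alpha
        if test_type == 3:
            row.append(float(t + 1))   # time trend, t counted from 1
        rows.append(row)
    m = len(rows)
    k = len(rows[0]) if rows else test_type
    if m <= k:
        raise ValueError("series too short for this test type")
    # Normal equations: (X'X) b = X' dy.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * d for r, d in zip(rows, dy)) for i in range(k)]
    inv = _invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((d - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, d in zip(rows, dy))
    s2 = rss / (m - k)                 # residual variance
    se = math.sqrt(s2 * inv[0][0])     # standard error of beta_1
    return beta[0] / se


y = [0.0, 1.0, 0.5, 1.5, 1.0, 2.0, 1.2, 0.8, 1.1, 0.3]
for test_type in (1, 2, 3):
    print(test_type, df_statistic(y, test_type))
```

In this sketch the requirement that the number of differenced observations exceeds the number of regressors gives $n > 2$, $n > 3$ and $n > 4$ for the three types.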

The probability distributions of these statistics are nonstandard and are a function of the length of the series of interest, $n$. The probability associated with a given test statistic, for a given $n$, can therefore only be calculated by simulation as described in Dickey and Fuller (1979). However, such simulations require a significant number of iterations and are therefore prohibitively expensive in terms of the time taken. As such, g01ewf also allows the probability to be interpolated from a look-up table. Two such tables are provided, one from Dickey (1976) and one constructed as described in Section 9. The three different methods of obtaining an estimate of the probability can be chosen via the method argument. Unless there is a specific reason for choosing otherwise, $method = 1$ should be used.
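The simulation approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the routine's algorithm: it handles only the $τ$ statistic with $p = 1$, uses Gaussian innovations and Python's `random` module rather than the NAG generators, and assumes the probability of interest is the lower-tail probability of the statistic under the unit-root null. The function names are hypothetical; `nsamp` mirrors the argument of the same name.

```python
import math
import random


def tau_no_constant(y):
    """Dickey-Fuller tau for the regression of the differences on the lagged level."""
    if len(y) < 3:
        raise ValueError("need n > 2")
    num = sum(y[t - 1] * (y[t] - y[t - 1]) for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    beta = num / den
    rss = sum((y[t] - y[t - 1] - beta * y[t - 1]) ** 2 for t in range(1, len(y)))
    s2 = rss / (len(y) - 2)            # (n - 1) observations, one parameter
    return beta / math.sqrt(s2 / den)


def simulate_probability(ts, n, nsamp, seed=0):
    """Estimate P(tau <= ts) from nsamp simulated random walks of length n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(nsamp):
        level, y = 0.0, []
        for _ in range(n):
            level += rng.gauss(0.0, 1.0)   # unit-root series: cumulated noise
            y.append(level)
        if tau_no_constant(y) <= ts:
            hits += 1
    return hits / nsamp


print(simulate_probability(-1.95, n=50, nsamp=2000))
```

Even this small run performs `n * nsamp` innovations per call, which shows why a look-up table is the practical default.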

4 References

Dickey D A (1976) Estimation and hypothesis testing in nonstationary time series PhD Thesis Iowa State University, Ames, Iowa

Dickey D A and Fuller W A (1979) Distribution of the estimators for autoregressive time series with a unit root J. Am. Stat. Assoc. 74 (366) 427–431

5 Arguments

1: $method$ – Integer Input

On entry: the method used to calculate the probability.

$method = 1$: The probability is interpolated from a look-up table, whose values were obtained via simulation.
$method = 2$: The probability is interpolated from a look-up table, whose values were obtained from Dickey (1976).
$method = 3$: The probability is obtained via simulation.

The probability calculated from the look-up table should give sufficient accuracy for most applications.

Suggested value: $method = 1$.

Constraint: $method = 1$, $2$ or $3$.

2: $type$ – Integer Input

On entry: the type of test statistic, supplied in ts.

Constraint: $type = 1$, $2$ or $3$.

3: $n$ – Integer Input

On entry: $n$, the length of the time series used to calculate the test statistic.

Constraints:

if $method \neq 3$, $n > 0$;
if $method = 3$ and $type = 1$, $n > 2$;
if $method = 3$ and $type = 2$, $n > 3$;
if $method = 3$ and $type = 3$, $n > 4$.

4: $ts$ – Real (Kind=nag_wp) Input

On entry: the Dickey–Fuller test statistic for which the probability is required. If

$type = 1$: ts must contain $τ$.
$type = 2$: ts must contain $τ_{μ}$.
$type = 3$: ts must contain $τ_{τ}$.

If the test statistic was calculated using g13awf, the values of type and n must not change between the call to g13awf and the call to g01ewf.

5: $nsamp$ – Integer Input

On entry: if $method = 3$, the number of samples used in the simulation; otherwise nsamp is not referenced and need not be set.

Constraint: if $method = 3$, $nsamp > 0$.

6: $state (*)$ – Integer array Communication Array

Note: the actual argument supplied must be the array state supplied to the initialization routines g05kff or g05kgf.

On entry: if $method = 3$, state must contain information on the selected base generator and its current state; otherwise state is not referenced and need not be set.

On exit: if $method = 3$, state contains updated information on the state of the generator; otherwise a zero length vector is returned.

7: $ifail$ – Integer Input/Output

On entry: ifail must be set to $0$, $-1$ or $1$. If you are unfamiliar with this argument you should refer to Section 3.4 in How to Use the NAG Library and its Documentation for details.

For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is $0$. When the value $-1$ or $1$ is used it is essential to test the value of ifail on exit.

On exit: $ifail = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

$ifail = 11$: On entry, $method = 〈value〉$.
Constraint: $method = 1$, $2$ or $3$.

$ifail = 21$: On entry, $type = 〈value〉$.
Constraint: $type = 1$, $2$ or $3$.

$ifail = 31$: On entry, $n = 〈value〉$.
Constraint: if $method \neq 3$, $n > 0$.

On entry, $n = 〈value〉$.
Constraint: if $method = 3$ and $type = 1$, $n > 2$.

On entry, $n = 〈value〉$.
Constraint: if $method = 3$ and $type = 2$, $n > 3$.

On entry, $n = 〈value〉$.
Constraint: if $method = 3$ and $type = 3$, $n > 4$.

$ifail = 51$: On entry, $nsamp = 〈value〉$.
Constraint: if $method = 3$, $nsamp > 0$.

$ifail = 61$: On entry, $method = 3$ and the state vector has been corrupted or not initialized.

$ifail = 201$: The supplied input values were outside the range of at least one look-up table, therefore extrapolation was used.

$ifail = -99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 3.9 in How to Use the NAG Library and its Documentation for further information.

$ifail = -399$: Your licence key may have expired or may not have been installed correctly.
See Section 3.8 in How to Use the NAG Library and its Documentation for further information.

$ifail = -999$: Dynamic memory allocation failed.
See Section 3.7 in How to Use the NAG Library and its Documentation for further information.
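The argument constraints behind the codes $11$, $21$, $31$ and $51$ can be checked before calling the routine. The helper below is a hypothetical Python sketch, not part of the NAG Library; the order in which the checks are applied is an assumption, and the codes $61$, $201$ and the negative values cannot be predicted from the arguments alone, so they are not covered.

```python
def check_arguments(method, type_, n, nsamp=0):
    """Return the ifail code the documented constraints would imply, or 0."""
    if method not in (1, 2, 3):
        return 11
    if type_ not in (1, 2, 3):
        return 21
    if method != 3:
        if n <= 0:                 # look-up table methods: n > 0
            return 31
    elif n <= type_ + 1:           # simulation: n > 2, 3 or 4 for type 1, 2 or 3
        return 31
    if method == 3 and nsamp <= 0:
        return 51
    return 0


# Simulation with a trend term needs n > 4, so n = 4 is rejected.
print(check_arguments(method=3, type_=3, n=4, nsamp=1000))
```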

7 Accuracy

When $method = 1$, the probability returned by this routine is unlikely to be accurate to more than $4$ or $5$ decimal places; for $method = 2$ this accuracy is likely to drop to $2$ or $3$ decimal places (see Section 9 for details on how these probabilities are constructed). In both cases the accuracy of the probability is likely to be lower when extrapolation is used, particularly for small values of n (less than around $15$). When $method = 3$ the accuracy of the returned probability is controlled by the number of simulations performed (i.e., the value of nsamp used).
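As a rough guide for $method = 3$, suppose the probability is estimated as the proportion of nsamp independent simulated statistics that fall at or below ts. This is an assumption made for illustration; the routine's internals are not described here. Under that assumption the standard error of the estimate follows from the binomial variance:

```latex
\mathrm{se}(\hat{p}) = \sqrt{\frac{p\,(1 - p)}{nsamp}},
\qquad
\text{e.g. } p = 0.05,\ nsamp = 10^{6}: \quad
\sqrt{\frac{0.05 \times 0.95}{10^{6}}} \approx 2.2 \times 10^{-4}.
```

On this basis, each additional decimal place of accuracy requires roughly a hundredfold increase in nsamp.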

8 Parallelism and Performance

g01ewf is not threaded in any implementation.

9 Further Comments

When $method = 1$ or $2$ the probability returned is constructed by interpolating from a series of look-up tables. In the case of $method = 2$ the look-up tables are taken directly from Dickey (1976) and the interpolation is carried out using e01saf and e01sbf. For $method = 1$ the look-up tables were constructed as follows:

(i)	A sample size, $n$, was chosen.
(ii)	$2^{28}$ simulations were run.
(iii)	At each simulation, a time series was constructed as described in chapter five of Dickey (1976). The relevant test statistic was then calculated for each of these time series.
(iv)	A series of quantiles were calculated from the sample of $2^{28}$ test statistics. The quantiles were calculated at intervals of $0.0005$ between $0.0005$ and $0.9995$.
(v)	A spline was fit to the quantiles using e02bef.

This process was repeated for $n = 25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 5000, 10000$, resulting in $22$ splines.
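Steps (i) to (v) can be sketched at a much smaller scale in Python for a single sample size. This is an illustration under stated assumptions rather than the procedure used to build the shipped tables: it uses a few thousand simulations instead of $2^{28}$, Gaussian random walks with $p = 1$ for the $τ$ statistic only, and linear interpolation between quantiles in place of the spline fit. Values outside the table are clamped here rather than extrapolated. All function names are hypothetical.

```python
import bisect
import math
import random


def tau_no_constant(y):
    """Dickey-Fuller tau for the regression of the differences on the lagged level."""
    num = sum(y[t - 1] * (y[t] - y[t - 1]) for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    beta = num / den
    rss = sum((y[t] - y[t - 1] - beta * y[t - 1]) ** 2 for t in range(1, len(y)))
    return beta / math.sqrt(rss / (len(y) - 2) / den)


def build_quantile_table(n, nsim, seed=0):
    """Simulate nsim statistics and return (probs, quantiles) on the 0.0005 grid."""
    rng = random.Random(seed)
    stats = []
    for _ in range(nsim):
        level, y = 0.0, []
        for _ in range(n):
            level += rng.gauss(0.0, 1.0)
            y.append(level)
        stats.append(tau_no_constant(y))
    stats.sort()
    probs = [0.0005 * k for k in range(1, 2000)]   # 0.0005, 0.0010, ..., 0.9995
    quantiles = []
    for q in probs:
        pos = q * (nsim - 1)                       # fractional order-statistic index
        lo = int(math.floor(pos))
        hi = min(lo + 1, nsim - 1)
        quantiles.append(stats[lo] + (pos - lo) * (stats[hi] - stats[lo]))
    return probs, quantiles


def lookup_probability(ts, probs, quantiles):
    """Linearly interpolate the probability for ts; clamp outside the table."""
    if ts <= quantiles[0]:
        return probs[0]
    if ts >= quantiles[-1]:
        return probs[-1]
    i = bisect.bisect_right(quantiles, ts)         # quantiles[i-1] <= ts < quantiles[i]
    w = (ts - quantiles[i - 1]) / (quantiles[i] - quantiles[i - 1])
    return probs[i - 1] + w * (probs[i] - probs[i - 1])


probs, quantiles = build_quantile_table(n=50, nsim=4000)
print(lookup_probability(-1.95, probs, quantiles))
```

Once the table exists, each look-up costs only a binary search, which is the trade-off that motivates the tabulated methods.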

Given the $22$ splines, and a user-supplied sample size, $n$, and test statistic, $τ$, an estimated $p$-value is calculated as follows:

(i)	Evaluate each of the $22$ splines, at $τ$, using e02bef. If, for a particular spline, the supplied value of $τ$ lies outside of the range of the simulated data, then a third-order Taylor expansion is used to extrapolate, with the derivatives being calculated using e02bcf.
(ii)	Fit a spline through these $22$ points using e01bef.
(iii)	Estimate the $p$-value using e01bff.