NAG Library Routine Document

G07BBF

exactly specified observations occur when $L_{i} = U_{i} = x_{i}$ ,
right-censored observations, known only by a lower bound, occur when $U_{i} \to \infty$ ,
left-censored observations, known only by an upper bound, occur when $L_{i} \to - \infty$ ,
and interval-censored observations when $L_{i} < x_{i} < U_{i}$ .

Let the set $A$ identify the exactly specified observations, sets $B$ and $C$ identify the observations censored on the right and left respectively, and set $D$ identify the observations confined between two finite limits. Also let there be $r$ exactly specified observations, i.e., the number in $A$. The probability density function for the standard Normal distribution is

Z (x) = \frac{1}{\sqrt{2 π}} \exp (- \frac{1}{2} x^{2}), - \infty < x < \infty

and the cumulative distribution function is

P (X) = 1 - Q (X) = \int_{- \infty}^{X} Z (x) d x .

The log-likelihood of the sample can be written as:

L (μ, σ) = - r \log σ - \frac{1}{2} \sum_{A} {\{(x_{i} - μ) / σ\}}^{2} + \sum_{B} \log (Q (l_{i})) + \sum_{C} \log (P (u_{i})) + \sum_{D} \log (p_{i})

where

p_{i} = P (u_{i}) - P (l_{i})

and

u_{i} = (U_{i} - μ) / σ, l_{i} = (L_{i} - μ) / σ

Let

S (x_{i}) = \frac{Z (x_{i})}{Q (x_{i})}, S_{1} (l_{i}, u_{i}) = \frac{Z (l_{i}) - Z (u_{i})}{p_{i}}

and

S_{2} (l_{i}, u_{i}) = \frac{u_{i} Z (u_{i}) - l_{i} Z (l_{i})}{p_{i}},

then the first derivatives of the log-likelihood can be written as:

\frac{\partial L (μ, σ)}{\partial μ} = L_{1} (μ, σ) = σ^{- 2} \sum_{A} (x_{i} - μ) + σ^{- 1} \sum_{B} S (l_{i}) - σ^{- 1} \sum_{C} S (- u_{i}) + σ^{- 1} \sum_{D} S_{1} (l_{i}, u_{i})

and

\frac{\partial L (μ, σ)}{\partial σ} = L_{2} (μ, σ) = - r σ^{- 1} + σ^{- 3} \sum_{A} {(x_{i} - μ)}^{2} + σ^{- 1} \sum_{B} l_{i} S (l_{i}) - σ^{- 1} \sum_{C} u_{i} S (- u_{i}) - σ^{- 1} \sum_{D} S_{2} (l_{i}, u_{i}) .
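The log-likelihood and its first derivatives can be checked against each other numerically. The following Python sketch is illustrative only: the function names and the `(kind, lo, hi)` tuple layout are assumptions made for this example and are not part of G07BBF. It evaluates $L$, $L_{1}$ and $L_{2}$ directly from the formulas above, using `math.erfc` for $P$ and $Q$.

```python
import math

def Z(x):
    """Standard Normal density Z(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def P(x):
    """Cumulative distribution function P(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def Q(x):
    """Upper tail probability Q(x) = 1 - P(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def S(x):
    return Z(x) / Q(x)

def loglik(mu, sigma, obs):
    """Log-likelihood L(mu, sigma).

    obs is a list of (kind, lo, hi) tuples (a layout assumed for this sketch):
    'A' exact (lo = x_i), 'B' right-censored (lo = L_i),
    'C' left-censored (hi = U_i), 'D' interval-censored (lo = L_i, hi = U_i).
    """
    total = 0.0
    for kind, lo, hi in obs:
        if kind == 'A':
            total += -math.log(sigma) - 0.5 * ((lo - mu) / sigma) ** 2
        elif kind == 'B':
            total += math.log(Q((lo - mu) / sigma))
        elif kind == 'C':
            total += math.log(P((hi - mu) / sigma))
        else:
            total += math.log(P((hi - mu) / sigma) - P((lo - mu) / sigma))
    return total

def gradient(mu, sigma, obs):
    """First derivatives (L_1, L_2) of the log-likelihood."""
    g1 = g2 = 0.0
    for kind, lo, hi in obs:
        if kind == 'A':
            g1 += (lo - mu) / sigma ** 2
            g2 += -1.0 / sigma + (lo - mu) ** 2 / sigma ** 3
        elif kind == 'B':
            l = (lo - mu) / sigma
            g1 += S(l) / sigma
            g2 += l * S(l) / sigma
        elif kind == 'C':
            u = (hi - mu) / sigma
            g1 -= S(-u) / sigma
            g2 -= u * S(-u) / sigma
        else:
            l, u = (lo - mu) / sigma, (hi - mu) / sigma
            p = P(u) - P(l)
            g1 += (Z(l) - Z(u)) / (p * sigma)          # S_1 / sigma
            g2 -= (u * Z(u) - l * Z(l)) / (p * sigma)  # -S_2 / sigma
    return g1, g2
```

Comparing `gradient` with central finite differences of `loglik` is a simple way to confirm the signs of the individual terms.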

The maximum likelihood estimates, $\hat{μ}$ and $\hat{σ}$, are the solution to the equations:

L_{1} (\hat{μ}, \hat{σ}) = 0 \qquad (1)

and

L_{2} (\hat{μ}, \hat{σ}) = 0 \qquad (2)

and if the second derivatives $\frac{\partial^{2} L}{\partial μ^{2}}$, $\frac{\partial^{2} L}{\partial μ \partial σ}$ and $\frac{\partial^{2} L}{\partial σ^{2}}$ are denoted by $L_{11}$, $L_{12}$ and $L_{22}$ respectively, then estimates of the standard errors of $\hat{μ}$ and $\hat{σ}$ are given by:

se (\hat{μ}) = \sqrt{\frac{- L_{22}}{L_{11} L_{22} - L_{12}^{2}}}, se (\hat{σ}) = \sqrt{\frac{- L_{11}}{L_{11} L_{22} - L_{12}^{2}}}

and an estimate of the correlation of $\hat{μ}$ and $\hat{σ}$ is given by:

\frac{L_{12}}{\sqrt{L_{11} L_{22}}} .
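These quantities are the entries of the inverse of the negated matrix of second derivatives: the variances are $- L_{22} / (L_{11} L_{22} - L_{12}^{2})$ and $- L_{11} / (L_{11} L_{22} - L_{12}^{2})$, the covariance is $L_{12} / (L_{11} L_{22} - L_{12}^{2})$, and dividing the covariance by the product of the two standard errors gives $L_{12} / \sqrt{L_{11} L_{22}}$. A minimal Python sketch follows; the function name and the negative-definiteness check are assumptions for illustration, not a description of the routine's internals.

```python
import math

def standard_errors(L11, L12, L22):
    """Return (se_mu, se_sigma, corr) from the second derivatives at the optimum."""
    det = L11 * L22 - L12 * L12
    # At a maximum the matrix of second derivatives should be negative definite.
    if L11 >= 0.0 or L22 >= 0.0 or det <= 0.0:
        raise ValueError("second-derivative matrix is not negative definite")
    se_mu = math.sqrt(-L22 / det)
    se_sigma = math.sqrt(-L11 / det)
    corr = L12 / math.sqrt(L11 * L22)
    return se_mu, se_sigma, corr
```

For a sample with no censoring, $L_{11} = - n / σ^{2}$, $L_{12} = 0$ and $L_{22} = - 2 n / σ^{2}$ at the maximum, so the sketch returns the familiar $σ / \sqrt{n}$ and $σ / \sqrt{2 n}$.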

To obtain the maximum likelihood estimates, equations (1) and (2) can be solved using either the Newton–Raphson method or the Expectation-maximization (EM) algorithm of Dempster et al. (1977).

Newton–Raphson Method

This consists of using approximate estimates $\tilde{μ}$ and $\tilde{σ}$ to obtain improved estimates $\tilde{μ} + δ \tilde{μ}$ and $\tilde{σ} + δ \tilde{σ}$ by solving

\begin{array}{l} δ \tilde{μ} L_{11} + δ \tilde{σ} L_{12} + L_{1} = 0, \\ δ \tilde{μ} L_{12} + δ \tilde{σ} L_{22} + L_{2} = 0, \end{array}

for the corrections $δ \tilde{μ}$ and $δ \tilde{σ}$.
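Since the system is only two by two, the corrections can be written in closed form with Cramer's rule. The Python sketch below is a hedged illustration of a single step; the function name is hypothetical and the derivative values are assumed to have been evaluated at the current estimates.

```python
def newton_corrections(L1, L2, L11, L12, L22):
    """Solve  d_mu*L11 + d_sigma*L12 + L1 = 0
              d_mu*L12 + d_sigma*L22 + L2 = 0
    for the corrections (d_mu, d_sigma)."""
    det = L11 * L22 - L12 * L12
    if det == 0.0:
        raise ZeroDivisionError("second-derivative matrix is singular")
    d_mu = (L12 * L2 - L22 * L1) / det
    d_sigma = (L12 * L1 - L11 * L2) / det
    return d_mu, d_sigma

# Example step with made-up derivative values.
d_mu, d_sigma = newton_corrections(0.3, -0.2, -2.0, 0.5, -4.0)
```

The improved estimates are then the current estimates plus `d_mu` and `d_sigma`.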

EM Algorithm

The expectation step consists of constructing the variable $w_{i}$ as follows:

if i \in A, w_{i} = x_{i} \qquad (3)

if i \in B, w_{i} = E (x_{i} ∣ x_{i} > L_{i}) = μ + σ S (l_{i}) \qquad (4)

if i \in C, w_{i} = E (x_{i} ∣ x_{i} < U_{i}) = μ - σ S (- u_{i}) \qquad (5)

if i \in D, w_{i} = E (x_{i} ∣ L_{i} < x_{i} < U_{i}) = μ + σ S_{1} (l_{i}, u_{i}) \qquad (6)

The maximization step consists of substituting (3), (4), (5) and (6) into (1) and (2), giving:

\hat{μ} = \sum_{i = 1}^{n} {\hat{w}}_{i} / n \qquad (7)

and

{\hat{σ}}^{2} = \sum_{i = 1}^{n} {({\hat{w}}_{i} - \hat{μ})}^{2} / \{r + \sum_{B} T ({\hat{l}}_{i}) + \sum_{C} T (- {\hat{u}}_{i}) + \sum_{D} T_{1} ({\hat{l}}_{i}, {\hat{u}}_{i})\} \qquad (8)

where

T (x) = S (x) \{S (x) - x\}, T_{1} (l, u) = S_{1}^{2} (l, u) + S_{2} (l, u)

and where ${\hat{w}}_{i}$, ${\hat{l}}_{i}$ and ${\hat{u}}_{i}$ are $w_{i}$, $l_{i}$ and $u_{i}$ evaluated at $\hat{μ}$ and $\hat{σ}$. Equations (3) to (8) are the basis of the EM iterative procedure for finding $\hat{μ}$ and ${\hat{σ}}^{2}$. The procedure consists of alternately estimating $\hat{μ}$ and ${\hat{σ}}^{2}$ using (7) and (8) and estimating $\{{\hat{w}}_{i}\}$ using (3) to (6).
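The alternation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the routine's implementation: the `(kind, lo, hi)` tuple layout, the function names and the stopping rule in `em` are all choices made for this example. Each sweep evaluates $w_{i}$, $T$ and $T_{1}$ at the current estimates and then applies (7) and (8).

```python
import math

SQRT2 = math.sqrt(2.0)

def Z(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def P(x):
    return 0.5 * math.erfc(-x / SQRT2)

def Q(x):
    return 0.5 * math.erfc(x / SQRT2)

def S(x):
    return Z(x) / Q(x)

def em_step(mu, sigma, obs):
    """One EM sweep. obs holds (kind, lo, hi) with kind in 'A', 'B', 'C', 'D'
    (exact, right-, left- and interval-censored; layout assumed for this sketch)."""
    w = []        # expected values w_i, equations (3) to (6)
    denom = 0.0   # r plus the T and T_1 terms, the denominator of (8)
    for kind, lo, hi in obs:
        if kind == 'A':
            w.append(lo)
            denom += 1.0
        elif kind == 'B':
            l = (lo - mu) / sigma
            s = S(l)
            w.append(mu + sigma * s)
            denom += s * (s - l)          # T(l)
        elif kind == 'C':
            u = (hi - mu) / sigma
            s = S(-u)
            w.append(mu - sigma * s)
            denom += s * (s + u)          # T(-u) = S(-u){S(-u) - (-u)}
        else:
            l, u = (lo - mu) / sigma, (hi - mu) / sigma
            p = P(u) - P(l)
            s1 = (Z(l) - Z(u)) / p
            s2 = (u * Z(u) - l * Z(l)) / p
            w.append(mu + sigma * s1)
            denom += s1 * s1 + s2         # T_1(l, u)
    new_mu = sum(w) / len(w)                                       # equation (7)
    new_sigma = math.sqrt(sum((wi - new_mu) ** 2 for wi in w) / denom)  # equation (8)
    return new_mu, new_sigma

def em(mu, sigma, obs, tol=5e-6, maxit=25):
    """Repeat em_step until both relative changes fall below tol (assumed rule)."""
    for it in range(1, maxit + 1):
        new_mu, new_sigma = em_step(mu, sigma, obs)
        done = (abs(new_mu - mu) < tol * abs(new_mu)
                and abs(new_sigma - sigma) < tol * abs(new_sigma))
        mu, sigma = new_mu, new_sigma
        if done:
            return mu, sigma, it
    return mu, sigma, maxit
```

With exact observations only, a single sweep returns the sample mean and the square root of $\sum {(x_{i} - \hat{μ})}^{2} / n$, whatever the starting values.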

In choosing between the two methods a general rule is that the Newton–Raphson method converges more quickly but requires good initial estimates, whereas the EM algorithm converges slowly but is robust to the initial values. In the case of the censored Normal distribution, if only a small proportion of the observations are censored then estimates based on the exact observations should give good enough initial estimates for the Newton–Raphson method to be used. If there is a high proportion of censored observations then the EM algorithm should be used and, if high accuracy is required, the subsequent use of the Newton–Raphson method to refine the estimates obtained from the EM algorithm should be considered.

4 References

Dempster A P, Laird N M and Rubin D B (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion) J. Roy. Statist. Soc. Ser. B 39 1–38

Swan A V (1969) Algorithm AS 16. Maximum likelihood estimation from grouped and censored normal data Appl. Statist. 18 110–114

Wolynetz M S (1979) Maximum likelihood estimation from confined and censored normal data Appl. Statist. 28 185–195

5 Parameters

1: METHOD – CHARACTER(1) Input

On entry: indicates whether the Newton–Raphson or EM algorithm should be used.

If $METHOD ='N'$, then the Newton–Raphson algorithm is used.

If $METHOD ='E'$, then the EM algorithm is used.

Constraint: $METHOD ='N'$ or $'E'$.

2: N – INTEGER Input

On entry: $n$, the number of observations.

Constraint: $N \geq 2$.

3: X(N) – REAL (KIND=nag_wp) array Input

On entry: the observations $x_{i}$, $L_{i}$ or $U_{i}$, for $i = 1, 2, \dots, n$.

If the observation is exactly specified – the exact value, $x_{i}$.

If the observation is right-censored – the lower value, $L_{i}$.

If the observation is left-censored – the upper value, $U_{i}$.

If the observation is interval-censored – the lower or upper value, $L_{i}$ or $U_{i}$ (see XC).

4: XC(N) – REAL (KIND=nag_wp) array Input

On entry: if the $j$th observation, for $j = 1, 2, \dots, n$, is an interval-censored observation then $XC (j)$ should contain the complementary value to $X (j)$, that is, if $X (j) < XC (j)$, then $XC (j)$ contains the upper value, $U_{j}$, and if $X (j) > XC (j)$, then $XC (j)$ contains the lower value, $L_{j}$. Otherwise, if the $j$th observation is exact or right- or left-censored, $XC (j)$ need not be set.

Note: if $X (j) = XC (j)$ then the observation is ignored.

5: IC(N) – INTEGER array Input

On entry: $IC (i)$ contains the censoring code for the $i$th observation, for $i = 1, 2, \dots, n$.

If $IC (i) = 0$, the observation is exactly specified.

If $IC (i) = 1$, the observation is right-censored.

If $IC (i) = 2$, the observation is left-censored.

If $IC (i) = 3$, the observation is interval-censored.

Constraint: $IC (i) = 0$, $1$, $2$ or $3$, for $i = 1, 2, \dots, n$.
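To make the layout of X, XC and IC concrete, the Python sketch below packs a list of (lower, upper) bounds into three parallel lists. The helper `encode`, the use of `None` for an infinite bound and the placeholder `0.0` for unused XC entries are assumptions made for this illustration; they are not part of the NAG interface.

```python
def encode(observations):
    """Pack (lower, upper) pairs into X, XC and IC style lists.

    None marks an infinite bound; lower == upper marks an exact value.
    """
    x, xc, ic = [], [], []
    for lower, upper in observations:
        if lower is None and upper is None:
            raise ValueError("at least one bound must be finite")
        if upper is None:        # right-censored: only the lower value is known
            x.append(lower); xc.append(0.0); ic.append(1)
        elif lower is None:      # left-censored: only the upper value is known
            x.append(upper); xc.append(0.0); ic.append(2)
        elif lower == upper:     # exactly specified
            x.append(lower); xc.append(0.0); ic.append(0)
        else:                    # interval-censored: XC holds the complementary value
            x.append(lower); xc.append(upper); ic.append(3)
    return x, xc, ic

# One observation of each type.
X, XC, IC = encode([(5.0, 5.0), (6.0, None), (None, 3.9), (4.0, 5.0)])
```

The XC entries for exact and singly censored observations are never read according to the description above, so any placeholder value would do.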

6: XMU – REAL (KIND=nag_wp) Input/Output

On entry: if $XSIG > 0.0$, the initial estimate of the mean, $μ$; otherwise XMU need not be set.

On exit: the maximum likelihood estimate, $\hat{μ}$, of $μ$.

7: XSIG – REAL (KIND=nag_wp) Input/Output

On entry: specifies whether initial estimates of $μ$ and $σ$ are to be supplied.

$XSIG > 0.0$

XSIG is the initial estimate of $σ$ and XMU must contain an initial estimate of $μ$.

$XSIG \leq 0.0$

Initial estimates of XMU and XSIG are calculated internally from:

(a)	the exact observations, if the number of exactly specified observations is $\geq 2$ ; or
(b)	the interval-censored observations, if the number of interval-censored observations is $\geq 1$ ; or
(c)	they are set to $0.0$ and $1.0$ respectively.

On exit: the maximum likelihood estimate, $\hat{σ}$, of $σ$.

8: TOL – REAL (KIND=nag_wp) Input

On entry: the relative precision required for the final estimates of $μ$ and $σ$. Convergence is assumed when the absolute relative changes in the estimates of both $μ$ and $σ$ are less than TOL.

If $TOL = 0.0$, then a relative precision of $0.000005$ is used.

Constraint: $machine precision < TOL \leq 1.0$ or $TOL = 0.0$.
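The convergence rule can be illustrated with a short Python sketch. The function names are hypothetical, and the choice to measure each change relative to the new estimate (falling back to an absolute change when that estimate is zero) is an assumption, since the text does not say exactly how the relative change is formed.

```python
def effective_tol(tol):
    """TOL = 0.0 selects the default relative precision of 0.000005."""
    return 0.000005 if tol == 0.0 else tol

def relative_change(old, new):
    # Assumption: scale by the new estimate, or by 1.0 if that estimate is zero.
    scale = abs(new) if new != 0.0 else 1.0
    return abs(new - old) / scale

def converged(mu_old, sigma_old, mu_new, sigma_new, tol):
    """True when the relative changes in both estimates are below the tolerance."""
    t = effective_tol(tol)
    return (relative_change(mu_old, mu_new) < t
            and relative_change(sigma_old, sigma_new) < t)
```

Both estimates must satisfy the test; a small change in $μ$ alone is not treated as convergence.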

9: MAXIT – INTEGER Input

On entry: the maximum number of iterations.

If $MAXIT \leq 0$, then a value of $25$ is used.

10: SEXMU – REAL (KIND=nag_wp) Output

On exit: the estimate of the standard error of $\hat{μ}$.

11: SEXSIG – REAL (KIND=nag_wp) Output

On exit: the estimate of the standard error of $\hat{σ}$.

12: CORR – REAL (KIND=nag_wp) Output

On exit: the estimate of the correlation between $\hat{μ}$ and $\hat{σ}$.

13: DEV – REAL (KIND=nag_wp) Output

On exit: the maximized log-likelihood, $L (\hat{μ}, \hat{σ})$.

14: NOBS( $4$ ) – INTEGER array Output

On exit: the number of observations of each type:

$NOBS (1)$ contains the number of right-censored observations.

$NOBS (2)$ contains the number of left-censored observations.

$NOBS (3)$ contains the number of interval-censored observations.

$NOBS (4)$ contains the number of exactly specified observations.
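Note that the ordering of NOBS differs from the numbering of the censoring codes in IC: code $0$ (exact) is counted in the last element. The Python sketch below shows the mapping with a zero-based list standing in for the array; the function name is hypothetical, and it does not model how observations with $X (j) = XC (j)$, which are ignored, affect the counts.

```python
def count_observations(ic):
    """Return [right, left, interval, exact] counts from censoring codes."""
    position = {1: 0, 2: 1, 3: 2, 0: 3}   # IC code -> zero-based NOBS index
    nobs = [0, 0, 0, 0]
    for code in ic:
        if code not in position:
            raise ValueError("censoring code must be 0, 1, 2 or 3")
        nobs[position[code]] += 1
    return nobs

counts = count_observations([0, 0, 1, 2, 3, 3, 0])
```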

15: NIT – INTEGER Output

On exit: the number of iterations performed.

16: WK( $2 \times N$ ) – REAL (KIND=nag_wp) array Workspace

17: IFAIL – INTEGER Input/Output

On entry: IFAIL must be set to $0$, $- 1$ or $1$. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.

For environments where it might be inappropriate to halt program execution when an error is detected, the value $- 1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is $0$. When the value $- 1$ or $1$ is used it is essential to test the value of IFAIL on exit.

On exit: $IFAIL = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry $IFAIL = 0$ or $- 1$, explanatory error messages are output on the current error message unit (as defined by X04AAF).

Errors or warnings detected by the routine:

$IFAIL = 1$

On entry,	$METHOD \neq'N'$ or $'E'$ ,
or	$N < 2$ ,
or	$IC (i) \neq 0$ , $1$ , $2$ or $3$ , for some $i$ ,
or	$TOL < 0.0$ ,
or	$0.0 < TOL < machine precision$ ,
or	$TOL > 1.0$ .

$IFAIL = 2$: The chosen method failed to converge in MAXIT iterations. You should either increase TOL or MAXIT or, if using the EM algorithm, try using the Newton–Raphson method with the values returned by the current call to G07BBF as initial values. All returned values will be reasonable approximations to the correct results if MAXIT is not very small.

$IFAIL = 3$: The chosen method is diverging. This will be due to poor initial values. You should try different initial values.

$IFAIL = 4$: G07BBF was unable to calculate the standard errors. This can be caused by the method starting to diverge when the maximum number of iterations was reached.

7 Accuracy

The accuracy is controlled by the parameter TOL.

If high precision is requested with the EM algorithm then there is a possibility that, due to the slow convergence, the increments of $\hat{μ}$ and $\hat{σ}$ may become smaller than TOL before the correct solution has been reached, and the process will prematurely assume convergence.

8 Further Comments

The process is deemed divergent if three successive increments of $μ$ or $σ$ increase.
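The exact form of this test is not spelled out, so the Python sketch below shows only one possible reading: the process is flagged as divergent when the magnitudes of the last three increments of either parameter are strictly increasing. The function names and this interpretation are assumptions made for illustration.

```python
def is_increasing(increments):
    """True if the last three increments grow in magnitude (one possible reading)."""
    if len(increments) < 3:
        return False
    a, b, c = (abs(v) for v in increments[-3:])
    return a < b < c

def is_diverging(mu_increments, sigma_increments):
    """Flag divergence if the increments of either mu or sigma keep growing."""
    return is_increasing(mu_increments) or is_increasing(sigma_increments)
```

The increment lists are assumed to be ordered from oldest to newest.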

9 Example

A sample of $18$ observations and their censoring codes are read in and the Newton–Raphson method is used to compute the estimates.
