
# NAG Toolbox: nag_surviv_kaplanmeier (g12aa)

## Purpose

nag_surviv_kaplanmeier (g12aa) computes the Kaplan–Meier (or product-limit) estimates of survival probabilities for a sample of failure times.

## Syntax

[nd, tp, p, psig, ifail] = g12aa(t, ic, freq, ifreq, 'n', n)
[nd, tp, p, psig, ifail] = nag_surviv_kaplanmeier(t, ic, freq, ifreq, 'n', n)

## Description

A survivor function, $S\left(t\right)$, is the probability of surviving to at least time $t$ with $S\left(t\right)=1-F\left(t\right)$, where $F\left(t\right)$ is the cumulative distribution function of the failure times. The Kaplan–Meier or product-limit estimator provides an estimate of $S\left(t\right)$, $\stackrel{^}{S}\left(t\right)$, from a sample of failure times which may be progressively right-censored.
Let ${t}_{i}$, $i=1,2,\dots ,{n}_{d}$, be the ordered distinct failure times for the sample of observed failure/censored times, and let the number of observations in the sample that have not failed by time ${t}_{i}$ be ${n}_{i}$. If a failure and a loss (censored observation) occur at the same time ${t}_{i}$, then the failure is treated as if it had occurred slightly before time ${t}_{i}$ and the loss as if it had occurred slightly after ${t}_{i}$.
The Kaplan–Meier estimate of the survival probabilities is a step function which in the interval ${t}_{i}$ to ${t}_{i+1}$ is given by
 $\stackrel{^}{S}\left(t\right)=\prod _{j=1}^{i}\left(\frac{{n}_{j}-{d}_{j}}{{n}_{j}}\right),$
where ${d}_{j}$ is the number of failures occurring at time ${t}_{j}$.
nag_surviv_kaplanmeier (g12aa) computes the Kaplan–Meier estimates and the corresponding estimates of the variances, $\stackrel{^}{\text{var}}\left(\stackrel{^}{S}\left(t\right)\right)$, using Greenwood's formula,
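 $\stackrel{^}{\text{var}}\left(\stackrel{^}{S}\left(t\right)\right)={\stackrel{^}{S}\left(t\right)}^{2}\sum _{j=1}^{i}\frac{{d}_{j}}{{n}_{j}\left({n}_{j}-{d}_{j}\right)}.$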
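
As an illustration, the product and Greenwood's formula above can be evaluated directly in plain MATLAB. This is a minimal sketch and not the NAG implementation: the variable names (`w`, `surv`, `gsum`, `n_i`, `d_i`) are chosen here for illustration, the data are those of the Example section with the frequencies held in `w`, and the risk set at ${t}_{i}$ is assumed to include observations censored at ${t}_{i}$, following the tie convention described above.

```matlab
% Data from the Example section of this page; w holds the frequencies.
t  = [6 6 7 9 10 10 11 13 16 17 19 20 22 23 25 32 34 35]';
ic = [1 0 0 1 0 1 1 0 0 1 1 1 0 0 1 1 1 1]';   % 0 = failure, 1 = censored
w  = ones(numel(t), 1);
w(2)  = 3;
w(16) = 2;

tp = unique(t(ic == 0 & w > 0));   % ordered distinct failure times t_i
nd = numel(tp);
S  = zeros(nd, 1);                 % Kaplan-Meier estimates
se = zeros(nd, 1);                 % square roots of the Greenwood variances
surv = 1;                          % running product
gsum = 0;                          % running Greenwood sum
for i = 1:nd
    n_i = sum(w(t >= tp(i)));             % at risk just before t_i
    d_i = sum(w(t == tp(i) & ic == 0));   % failures at t_i
    surv = surv * (n_i - d_i) / n_i;
    gsum = gsum + d_i / (n_i * (n_i - d_i));
    S(i)  = surv;
    se(i) = surv * sqrt(gsum);
end
fprintf('%6.1f%10.3f%12.3f\n', [tp S se]');
```

The sketch does not guard against ${n}_{i}={d}_{i}$ (every remaining observation failing at the last time point), where the Greenwood term divides by zero.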

## References

Gross A J and Clark V A (1975) Survival Distributions: Reliability Applications in the Biomedical Sciences Wiley
Kalbfleisch J D and Prentice R L (1980) The Statistical Analysis of Failure Time Data Wiley

## Parameters

### Compulsory Input Parameters

1:     $\mathrm{t}\left({\mathbf{n}}\right)$ – double array
The failure and censored times; these need not be ordered.
2:     $\mathrm{ic}\left({\mathbf{n}}\right)$ – int64/int32/nag_int array
${\mathbf{ic}}\left(\mathit{i}\right)$ contains the censoring code of the $\mathit{i}$th observation, for $\mathit{i}=1,2,\dots ,{\mathbf{n}}$.
${\mathbf{ic}}\left(i\right)=0$
The $i$th observation is a failure time.
${\mathbf{ic}}\left(i\right)=1$
The $i$th observation is right-censored.
Constraint: ${\mathbf{ic}}\left(\mathit{i}\right)=0$ or $1$, for $\mathit{i}=1,2,\dots ,{\mathbf{n}}$.
3:     $\mathrm{freq}$ – string (length ≥ 1)
Indicates whether frequencies are provided for each time point.
${\mathbf{freq}}=\text{'F'}$
Frequencies are provided for each failure and censored time.
${\mathbf{freq}}=\text{'S'}$
The failure and censored times are considered as single observations, i.e., a frequency of $1$ is assumed.
Constraint: ${\mathbf{freq}}=\text{'F'}$ or $\text{'S'}$.
4:     $\mathrm{ifreq}\left(:\right)$ – int64/int32/nag_int array
The dimension of the array ifreq must be at least ${\mathbf{n}}$ if ${\mathbf{freq}}=\text{'F'}$ and at least $1$ if ${\mathbf{freq}}=\text{'S'}$.
If ${\mathbf{freq}}=\text{'F'}$, ${\mathbf{ifreq}}\left(i\right)$ must contain the frequency of the $i$th observation.
If ${\mathbf{freq}}=\text{'S'}$, a frequency of $1$ is assumed and ifreq is not referenced.
Constraint: if ${\mathbf{freq}}=\text{'F'}$, ${\mathbf{ifreq}}\left(\mathit{i}\right)\ge 0$, for $\mathit{i}=1,2,\dots ,{\mathbf{n}}$.

### Optional Input Parameters

1:     $\mathrm{n}$ – int64/int32/nag_int scalar
Default: the dimension of the arrays ic, t. (An error is raised if these dimensions are not equal.)
The number of failure and censored times given in t.
Constraint: ${\mathbf{n}}\ge 2$.
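
The input constraints listed above can be checked before the call. The following is a hypothetical pre-call check in plain MATLAB, not part of the NAG Toolbox: the variable `code` and the order of the tests are assumptions, and only the first character of freq is inspected, which is inferred from the Example passing `'Frequencies'`. The values assigned to `code` follow the ifail values documented under Error Indicators and Warnings.

```matlab
% Small illustrative inputs (not the full example data set).
t     = [6; 6; 7; 9];
ic    = int64([1; 0; 0; 1]);
freq  = 'F';
ifreq = int64([1; 3; 1; 1]);

n = numel(t);
assert(numel(ic) == n, 't and ic must have the same number of elements');
usesFreq = (freq(1) == 'F');
if usesFreq
    assert(numel(ifreq) >= n, 'ifreq needs at least n elements when freq = ''F''');
end

if n < 2
    code = 1;                          % n < 2
elseif ~any(freq(1) == 'FS')
    code = 2;                          % freq is neither 'F' nor 'S'
elseif any(ic ~= 0 & ic ~= 1)
    code = 3;                          % censoring code other than 0 or 1
elseif usesFreq && any(ifreq(1:n) < 0)
    code = 4;                          % negative frequency
else
    code = 0;                          % no documented constraint violated
end
fprintf('anticipated ifail = %d\n', code);
```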

### Output Parameters

1:     $\mathrm{nd}$ – int64/int32/nag_int scalar
The number of distinct failure times, ${n}_{d}$.
2:     $\mathrm{tp}\left({\mathbf{n}}\right)$ – double array
${\mathbf{tp}}\left(\mathit{i}\right)$ contains the $\mathit{i}$th ordered distinct failure time, ${t}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,{n}_{\mathrm{d}}$.
3:     $\mathrm{p}\left({\mathbf{n}}\right)$ – double array
${\mathbf{p}}\left(\mathit{i}\right)$ contains the Kaplan–Meier estimate of the survival probability, $\stackrel{^}{S}\left(t\right)$, for time ${\mathbf{tp}}\left(\mathit{i}\right)$, for $\mathit{i}=1,2,\dots ,{n}_{d}$.
4:     $\mathrm{psig}\left({\mathbf{n}}\right)$ – double array
${\mathbf{psig}}\left(\mathit{i}\right)$ contains an estimate of the standard deviation of ${\mathbf{p}}\left(\mathit{i}\right)$, for $\mathit{i}=1,2,\dots ,{n}_{d}$.
5:     $\mathrm{ifail}$ – int64/int32/nag_int scalar
${\mathbf{ifail}}={\mathbf{0}}$ unless the function detects an error (see Error Indicators and Warnings).
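
Only the first nd elements of tp, p and psig are meaningful, and together they define a step function. As a sketch of how the outputs might be used, the snippet below evaluates that step function at arbitrary query times. The names `tq`, `steps`, `idx` and `Sq` are hypothetical, the values of `tp` and `p` are copied from the example results on this page, and the sketch assumes the estimate is $1$ before the first failure time and holds its last value beyond the final one.

```matlab
% Values taken from the example results (already trimmed to 1:nd).
tp = [6; 7; 10; 13; 16; 22; 23];
p  = [0.857; 0.807; 0.753; 0.690; 0.627; 0.538; 0.448];

steps = [1; p];                  % survival is 1 before the first failure
tq    = [0; 6; 8.5; 23; 40];     % query times (illustrative)

% Count the failure times at or before each query time, then shift by
% one to index into steps.
idx = arrayfun(@(x) sum(tp <= x), tq) + 1;
Sq  = steps(idx(:));
disp([tq Sq]);
```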

## Error Indicators and Warnings

Errors or warnings detected by the function:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{n}}<2$.
${\mathbf{ifail}}=2$
 On entry, ${\mathbf{freq}}\ne \text{'F'}$ or $\text{'S'}$.
${\mathbf{ifail}}=3$
 On entry, ${\mathbf{ic}}\left(i\right)\ne 0$ or $1$, for some $i=1,2,\dots ,{\mathbf{n}}$.
${\mathbf{ifail}}=4$
 On entry, ${\mathbf{freq}}=\text{'F'}$ and ${\mathbf{ifreq}}\left(i\right)<0$, for some $i=1,2,\dots ,{\mathbf{n}}$.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.

## Accuracy

The computations are believed to be stable.

## Further Comments

If there are no censored observations, $\stackrel{^}{S}\left(t\right)$ reduces to the ordinary binomial estimate of the probability of survival at time $t$.
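
A short derivation shows why. With no censoring, the risk set shrinks only through failures, so ${n}_{j+1}={n}_{j}-{d}_{j}$ (taking ${n}_{i+1}$ to mean the number still surviving after ${t}_{i}$) and ${n}_{1}=n$, the total number of observations. The product then telescopes, and for ${t}_{i}\le t<{t}_{i+1}$:
 $\stackrel{^}{S}\left(t\right)=\prod _{j=1}^{i}\frac{{n}_{j}-{d}_{j}}{{n}_{j}}=\prod _{j=1}^{i}\frac{{n}_{j+1}}{{n}_{j}}=\frac{{n}_{i+1}}{{n}_{1}}=\frac{n-\sum _{j=1}^{i}{d}_{j}}{n},$
which is the observed proportion of the sample surviving beyond time $t$.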

## Example

The remission times for a set of $21$ leukaemia patients at $18$ distinct time points are read in and the Kaplan–Meier estimate computed and printed. For further details see page 242 of Gross and Clark (1975).
```
function g12aa_example

fprintf('g12aa example results\n\n');

t  = [ 6;  6;  7;  9; 10; 10; 11; 13; 16;
      17; 19; 20; 22; 23; 25; 32; 34; 35];
ic = [int64(1);  0;  0;  1;  0;  1;  1;  0;  0;
             1;  1;  1;  0;  0;  1;  1;  1;  1];

freq  = 'Frequencies';
ifreq = ones(numel(t),1,'int64');
ifreq(2)  = 3;
ifreq(16) = 2;

% Calculate Kaplan-Meier statistic
[nd, tp, p, psig, ifail] = g12aa( ...
t, ic, freq, ifreq);

% Display the results
fprintf('  Time   Survival    Standard\n');
fprintf('        probability  deviation\n\n');
fprintf('%6.1f%10.3f%12.3f\n', [tp(1:nd) p(1:nd) psig(1:nd)]');

fig1 = figure;
stp = [0; tp(1:nd)];
sp  = [1; p(1:nd)];
stairs(stp,sp);
xlabel('Time');
ylabel('Survival probability');
title('Kaplan Meier plot');
legend('Off');
axis([0 tp(nd)+1 0 1.1]);

```
```
g12aa example results

  Time   Survival    Standard
        probability  deviation

   6.0     0.857       0.076
   7.0     0.807       0.087
  10.0     0.753       0.096
  13.0     0.690       0.107
  16.0     0.627       0.114
  22.0     0.538       0.128
  23.0     0.448       0.135
```