The routine may be called by the names g08alf or nagf_nonpar_test_cochranq.
3 Description
Cochran's $Q$-test may be used to test for differences between $k$ treatments applied independently to $n$ individuals or blocks ($k$ related samples of equal size $n$), where the observed response can take only one of two possible values; for example a treatment may result in a ‘success’ or ‘failure’. The data is recorded as either $1$ or $0$ to represent this dichotomization.
The use of this ‘randomized block design’ allows the effect of differences between the blocks to be separated from the differences between the treatments. The test assumes that the blocks were randomly selected from all possible blocks and that the result may be one of two possible outcomes common to all treatments within blocks.
The null and alternative hypotheses to be tested may be stated as follows.
${H}_{0}$: the treatments are equally effective, that is the probability of obtaining a $1$ within a block is the same for each treatment.
${H}_{1}$: there is a difference between the treatments, that is the probability of obtaining a $1$ is not the same for different treatments within blocks.
The data is often represented in the form of a table with the $n$ rows representing the blocks and the $k$ columns the treatments. Let ${R}_{\mathit{i}}$ represent the row totals, for $\mathit{i}=1,2,\dots ,n$, and ${C}_{\mathit{j}}$ represent the column totals, for $\mathit{j}=1,2,\dots ,k$. Let ${x}_{ij}$ represent the response or result where ${x}_{ij}=0$ or $1$.
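$$\begin{array}{c|cccc|c}
\text{Blocks} & \text{Treatment } 1 & \text{Treatment } 2 & \cdots & \text{Treatment } k & \text{Row Totals} \\
\hline
1 & {x}_{11} & {x}_{12} & \cdots & {x}_{1k} & {R}_{1} \\
2 & {x}_{21} & {x}_{22} & \cdots & {x}_{2k} & {R}_{2} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
n & {x}_{n1} & {x}_{n2} & \cdots & {x}_{nk} & {R}_{n} \\
\hline
\text{Column Totals} & {C}_{1} & {C}_{2} & \cdots & {C}_{k} & N=\text{Grand Total}
\end{array}$$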
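The totals in the table can be computed directly from the zero-one matrix. The following is a minimal Python sketch, not part of the NAG Library; the function name `totals` and the small data set are hypothetical and chosen only for illustration.

```python
def totals(x):
    """Return (row_totals, column_totals, grand_total) for a 0/1 matrix.

    x is a list of n rows (blocks), each holding k entries (treatments).
    """
    row_totals = [sum(row) for row in x]        # R_i, one per block
    col_totals = [sum(col) for col in zip(*x)]  # C_j, one per treatment
    grand_total = sum(row_totals)               # N
    return row_totals, col_totals, grand_total


# Hypothetical data: 4 blocks, 3 treatments.
x = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
R, C, N = totals(x)
print(R, C, N)
```

The grand total $N$ can equally be obtained by summing the column totals, which gives a cheap consistency check.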
If ${p}_{ij}=\mathrm{Pr}({x}_{ij}=1)$, for $i=1,2,\dots ,n$ and $j=1,2,\dots ,k$, then the hypotheses may be restated as follows:
${H}_{0}$: ${p}_{i1}={p}_{i2}=\cdots ={p}_{ik}$, for each $i=1,2,\dots ,n$.
${H}_{1}$: ${p}_{ij}\ne {p}_{il}$, for some $j$ and $l$, and for some $i$.
The test statistic is defined as $Q=\frac{k(k-1)\sum _{j=1}^{k}{({C}_{j}-N/k)}^{2}}{\sum _{i=1}^{n}{R}_{i}(k-{R}_{i})}$.
When the number of blocks, $n$, is large relative to the number of treatments, $k$, $Q$ has an approximate ${\chi}^{2}$-distribution with $k-1$ degrees of freedom. This is used to find the probability, $p$, of obtaining a statistic greater than or equal to the computed value of $Q$. Thus $p$ is the upper tail probability associated with the computed value of $Q$, where the ${\chi}^{2}$-distribution is used to approximate the true distribution of $Q$.
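As an illustration of the calculation, the sketch below evaluates the standard form of Cochran's statistic, $Q=k(k-1)\sum_{j}{({C}_{j}-N/k)}^{2}/\sum_{i}{R}_{i}(k-{R}_{i})$, and its ${\chi}^{2}$ upper tail probability with $k-1$ degrees of freedom. It is a Python sketch under stated assumptions, not the g08alf implementation: the names `cochran_q` and `chi2_upper_tail` are hypothetical, the tail probability uses the closed-form finite series for integer degrees of freedom (which may differ from the method g08alf uses), and raising an error when every block is constant is a choice made here, since this text does not say how the routine treats that case.

```python
import math


def chi2_upper_tail(q, df):
    """Upper tail probability of a chi-square variate with integer df >= 1."""
    if q <= 0.0:
        return 1.0
    half = q / 2.0
    if df % 2 == 0:
        # Even df: exp(-q/2) * sum_{i=0}^{df/2-1} (q/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(q/2)) plus a finite correction series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * q / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= q / (2 * r - 1)
        total += term
    return total


def cochran_q(x):
    """Return (Q, p) for a 0/1 matrix x with n rows (blocks), k columns."""
    k = len(x[0])
    row_totals = [sum(row) for row in x]
    col_totals = [sum(col) for col in zip(*x)]
    grand_total = sum(row_totals)
    denom = sum(r * (k - r) for r in row_totals)
    if denom == 0:
        # Assumption of this sketch: treat the all-constant case as an error.
        raise ValueError("every block is all 0 or all 1; Q is undefined")
    numer = k * (k - 1) * sum((c - grand_total / k) ** 2 for c in col_totals)
    q = numer / denom
    return q, chi2_upper_tail(q, k - 1)


# Hypothetical data: 6 blocks, 3 treatments.
data = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
q, p = cochran_q(data)
print(f"Q = {q:.4f}, p = {p:.4f}")
```

For these illustrative data the column totals are $5$, $3$ and $1$, so $Q=6\times 8/8=6$ and, with $2$ degrees of freedom, $p={e}^{-3}\approx 0.0498$.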
4 References
Conover W J (1980) Practical Nonparametric Statistics Wiley
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill
5 Arguments
1: $\mathbf{n}$ – Integer Input
On entry: $n$, the number of blocks.
Constraint: ${\mathbf{n}}\ge 2$.
2: $\mathbf{k}$ – Integer Input
On entry: $k$, the number of treatments.
Constraint: ${\mathbf{k}}\ge 2$.
3: $\mathbf{x}({\mathbf{ldx}},{\mathbf{k}})$ – Real (Kind=nag_wp) array Input
On entry: the matrix of observed zero-one data.
${\mathbf{x}}(\mathit{i},\mathit{j})$ must contain the value ${x}_{\mathit{i}\mathit{j}}$, for $\mathit{i}=1,2,\dots ,n$ and $\mathit{j}=1,2,\dots ,k$.
Constraint: ${\mathbf{x}}(\mathit{i},\mathit{j})=0.0$ or $1.0$, for $\mathit{i}=1,2,\dots ,n$ and $\mathit{j}=1,2,\dots ,k$.
4: $\mathbf{ldx}$ – Integer Input
On entry: the first dimension of the array x as declared in the (sub)program from which g08alf is called.
Constraint: ${\mathbf{ldx}}\ge {\mathbf{n}}$.
5: $\mathbf{q}$ – Real (Kind=nag_wp) Output
On exit: the value of the Cochran $Q$-test statistic.
6: $\mathbf{prob}$ – Real (Kind=nag_wp) Output
On exit: the upper tail probability, $p$, associated with the Cochran $Q$-test statistic, that is the probability of obtaining a value greater than or equal to the observed value (the output value of q).
7: $\mathbf{ifail}$ – Integer Input/Output
On entry: ifail must be set to $0$, $\mathrm{-1}$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $\mathrm{-1}$ means that an error message is printed while a value of $1$ means that it is not.
If halting is not appropriate, the value $\mathrm{-1}$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
6 Error Indicators and Warnings
If on entry ${\mathbf{ifail}}=0$ or $\mathrm{-1}$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=1$
On entry, ${\mathbf{k}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{k}}\ge 2$.
On entry, ${\mathbf{ldx}}=\langle \mathit{\text{value}}\rangle $ and ${\mathbf{n}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{ldx}}\ge {\mathbf{n}}$.
On entry, ${\mathbf{n}}=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{n}}\ge 2$.
${\mathbf{ifail}}=2$
On entry, $i=\langle \mathit{\text{value}}\rangle $, $j=\langle \mathit{\text{value}}\rangle $ and ${\mathbf{x}}(i,j)=\langle \mathit{\text{value}}\rangle $.
Constraint: ${\mathbf{x}}(i,j)=0.0$ or $1.0$.
${\mathbf{ifail}}=3$
The solution has failed to converge while calculating the tail probability. The returned result may still be a reasonable approximation.
${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
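The input checks behind ${\mathbf{ifail}}=1$ and ${\mathbf{ifail}}=2$ can be mirrored in a few lines. The following Python sketch is illustrative only: `check_inputs` is a hypothetical helper rather than a NAG routine, its messages are not the library's own, and the test on the number of rows stands in for the ${\mathbf{ldx}}\ge {\mathbf{n}}$ constraint, which has no direct analogue for a Python list of rows.

```python
def check_inputs(n, k, x):
    """Return (code, message); code 0 means every check passed.

    Codes 1 and 2 follow the ifail values documented above. x is assumed
    to be a list of rows, each holding at least k entries.
    """
    if n < 2:
        return 1, f"n = {n}; constraint: n >= 2"
    if k < 2:
        return 1, f"k = {k}; constraint: k >= 2"
    if len(x) < n:
        # Stand-in for ldx >= n: the storage must hold at least n rows.
        return 1, f"x has {len(x)} rows and n = {n}; constraint: rows >= n"
    for i in range(n):
        for j in range(k):
            if x[i][j] not in (0.0, 1.0):
                # Report 1-based indices, as the Fortran interface does.
                return 2, f"i = {i + 1}, j = {j + 1}, x(i,j) = {x[i][j]}"
    return 0, ""


print(check_inputs(2, 2, [[1, 0], [0, 1]]))
print(check_inputs(1, 3, [[1, 0, 1]]))
print(check_inputs(2, 2, [[1, 0], [0, 2]]))
```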
7 Accuracy
The use of the ${\chi}^{2}$-distribution as an approximation to the true distribution of the Cochran $Q$-test statistic improves as $k$ increases and as $n$ increases relative to $k$. This approximation should be a reasonable one when the total number of observations left, after omitting those rows containing all $0$ or $1$, is greater than about $25$ and the number of rows left is larger than $5$.
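The rule of thumb above is easy to check before relying on the returned probability. The sketch below is a hypothetical Python helper (`chi2_approximation_ok` is not a NAG routine); it takes the thresholds $25$ and $5$ literally as defaults, although the text gives the first only as "about $25$".

```python
def chi2_approximation_ok(x, min_observations=25, min_rows=5):
    """Apply the rule of thumb to a 0/1 matrix x (list of rows).

    Rows that are all 0 or all 1 are dropped first. Returns a tuple
    (ok, rows_left, observations_left).
    """
    k = len(x[0])
    rows_left = sum(1 for row in x if 0 < sum(row) < k)
    observations_left = rows_left * k
    ok = observations_left > min_observations and rows_left > min_rows
    return ok, rows_left, observations_left


# Hypothetical data: 6 blocks, 3 treatments; two blocks are constant.
small = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
print(chi2_approximation_ok(small))

# Hypothetical data: 10 non-constant blocks, 3 treatments.
larger = [[1, 0, 0]] * 10
print(chi2_approximation_ok(larger))
```

When the check fails, the computed $p$ should be read with caution, because the ${\chi}^{2}$ approximation may be poor for so few informative blocks.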
8 Parallelism and Performance
g08alf is not threaded in any implementation.
9 Further Comments
None.
10 Example
The following example is taken from page 201 of Conover (1980). The data represents the success of three basketball enthusiasts in predicting the outcome of $12$ collegiate basketball games, selected at random, using $1$ for successful prediction of the outcome and $0$ for unsuccessful prediction. This data is read in and the Cochran $Q$-test statistic and its corresponding upper tail probability are computed and printed.