NAG FL Interface
g08cgf (test_chisq)

1 Purpose

g08cgf computes the test statistic for the $\chi^2$ goodness-of-fit test for data with a chosen number of class intervals.

2 Specification

Fortran Interface
Subroutine g08cgf ( nclass, ifreq, cb, dist, par, npest, prob, chisq, p, ndf, eval, chisqi, ifail)
Integer, Intent (In) :: nclass, ifreq(nclass), npest
Integer, Intent (Inout) :: ifail
Integer, Intent (Out) :: ndf
Real (Kind=nag_wp), Intent (In) :: cb(nclass-1), par(2), prob(nclass)
Real (Kind=nag_wp), Intent (Out) :: chisq, p, eval(nclass), chisqi(nclass)
Character (1), Intent (In) :: dist
C Header Interface
#include <nag.h>
void  g08cgf_ (const Integer *nclass, const Integer ifreq[], const double cb[], const char *dist, const double par[], const Integer *npest, const double prob[], double *chisq, double *p, Integer *ndf, double eval[], double chisqi[], Integer *ifail, const Charlen length_dist)
The routine may be called by the names g08cgf or nagf_nonpar_test_chisq.

3 Description

The $\chi^2$ goodness-of-fit test performed by g08cgf is used to test the null hypothesis that a random sample arises from a specified distribution against the alternative hypothesis that the sample does not arise from the specified distribution.
Given a sample of size $n$, denoted by $x_1, x_2, \ldots, x_n$, drawn from a random variable $X$, and that the data has been grouped into $k$ classes,
$$x \leq c_1, \quad c_{i-1} < x \leq c_i, \ i = 2, 3, \ldots, k-1, \quad x > c_{k-1},$$
then the $\chi^2$ goodness-of-fit test statistic is defined by
$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},$$
where $O_i$ is the observed frequency of the $i$th class, and $E_i$ is the expected frequency of the $i$th class.
The expected frequencies are computed as
$$E_i = p_i \times n,$$
where $p_i$ is the probability that $X$ lies in the $i$th class, that is
$$p_1 = P(X \leq c_1), \quad p_i = P(c_{i-1} < X \leq c_i), \ i = 2, 3, \ldots, k-1, \quad p_k = P(X > c_{k-1}).$$
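These probabilities are either taken from a common probability distribution or are supplied by you. The available probability distributions within this routine are:
  • Normal distribution with mean $\mu$ and variance $\sigma^2$;
  • uniform distribution with boundaries $a$ and $b$;
  • exponential distribution with parameter $\lambda$;
  • $\chi^2$-distribution with a given number of degrees of freedom;
  • gamma distribution with parameters $\alpha$ and $\beta$.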
You must supply the frequencies and classes. Given a set of data and classes the frequencies may be calculated using g01aef.
g08cgf returns the $\chi^2$ test statistic, $X^2$, together with its degrees of freedom and the upper tail probability from the $\chi^2$-distribution associated with the test statistic. Note that the use of the $\chi^2$-distribution as an approximation to the distribution of the test statistic improves as the expected values in each class increase.
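The quantities described in this section can be sketched in a few lines of Python. This is only an illustration of the formulas, not the routine's implementation: the function names `chi2_upper_tail` and `chisq_gof` are hypothetical, the degrees of freedom are taken as $k - 1 - \mathrm{npest}$ as documented for ndf in Section 5, the tail probability uses a closed-form recurrence that is valid for integer degrees of freedom, and the treatment of a class with zero expected and zero observed frequency is an assumption.

```python
import math


def chi2_upper_tail(x2, ndf):
    """Upper tail probability P(chi^2 with ndf degrees of freedom > x2), integer ndf >= 1."""
    if ndf < 1:
        raise ValueError("ndf must be a positive integer")
    if x2 <= 0.0:
        return 1.0
    x = 0.5 * x2
    if ndf % 2 == 0:
        a, q = 1.0, math.exp(-x)              # Q(1, x) = exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))   # Q(1/2, x) = erfc(sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x^a exp(-x) / Gamma(a + 1)
    while a < 0.5 * ndf:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


def chisq_gof(ifreq, prob, npest=0):
    """Return (chisq, p, ndf, expected, contrib) from frequencies and class probabilities."""
    n = sum(ifreq)
    expected = [p_i * n for p_i in prob]       # E_i = p_i * n
    contrib = []
    for o_i, e_i in zip(ifreq, expected):
        if e_i == 0.0:
            if o_i != 0:
                raise ValueError("expected frequency is zero but observed is not")
            contrib.append(0.0)                # assumption: empty class contributes nothing
        else:
            contrib.append((o_i - e_i) ** 2 / e_i)
    chisq = sum(contrib)
    ndf = len(ifreq) - 1 - npest
    return chisq, chi2_upper_tail(chisq, ndf), ndf, expected, contrib


if __name__ == "__main__":
    # Made-up frequencies for five equiprobable classes (not the data of Section 10).
    chisq, p, ndf, expected, contrib = chisq_gof([18, 22, 20, 25, 15], [0.2] * 5)
    print(chisq, p, ndf)
```

A large value of the statistic relative to the degrees of freedom gives a small upper tail probability, which is evidence against the null hypothesis.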

4 References

Conover W J (1980) Practical Nonparametric Statistics Wiley
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill

5 Arguments

1: nclass Integer Input
On entry: k, the number of classes into which the data is divided.
Constraint: $\mathrm{nclass} \geq 2$.
2: ifreq(nclass) Integer array Input
On entry: $\mathrm{ifreq}(i)$ must specify the frequency of the $i$th class, $O_i$, for $i = 1, 2, \ldots, k$.
Constraint: $\mathrm{ifreq}(i) \geq 0$, for $i = 1, 2, \ldots, k$.
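3: cb(nclass-1) Real (Kind=nag_wp) array Input
On entry: $\mathrm{cb}(i)$ must specify the upper boundary value for the $i$th class, for $i = 1, 2, \ldots, k-1$.
Constraint: $\mathrm{cb}(1) < \mathrm{cb}(2) < \cdots < \mathrm{cb}(\mathrm{nclass}-1)$. For the exponential, gamma and $\chi^2$-distributions $\mathrm{cb}(1) \geq 0.0$.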
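To illustrate how the upper boundaries partition the data, the following Python sketch counts observations per class using the class definition of Section 3 (the first class is closed above at the first boundary, the last class is open above). It is not g01aef, and the function name `class_frequencies` is hypothetical.

```python
from bisect import bisect_left


def class_frequencies(x, cb):
    """Count observations per class for strictly increasing upper boundaries cb."""
    if any(lo >= hi for lo, hi in zip(cb, cb[1:])):
        raise ValueError("cb must be strictly increasing")
    ifreq = [0] * (len(cb) + 1)            # k = len(cb) + 1 classes
    for xi in x:
        # bisect_left returns the number of boundaries strictly below xi,
        # so xi == cb[j] falls in class j (0-based), matching c_{i-1} < x <= c_i.
        ifreq[bisect_left(cb, xi)] += 1
    return ifreq
```

An observation exactly equal to a boundary is counted in the class for which that boundary is the upper limit.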
4: dist Character(1) Input
On entry: indicates for which distribution the test is to be carried out.
dist='N'
The Normal distribution is used.
dist='U'
The uniform distribution is used.
dist='E'
The exponential distribution is used.
dist='C'
The $\chi^2$-distribution is used.
dist='G'
The gamma distribution is used.
dist='A'
You must supply the class probabilities in the array prob.
Constraint: dist='N', 'U', 'E', 'C', 'G' or 'A'.
5: par(2) Real (Kind=nag_wp) array Input
On entry: must contain the parameters of the distribution which is being tested. If you supply the probabilities (i.e., dist='A') the array par is not referenced.
If a Normal distribution is used then $\mathrm{par}(1)$ and $\mathrm{par}(2)$ must contain the mean, $\mu$, and the variance, $\sigma^2$, respectively.
If a uniform distribution is used then $\mathrm{par}(1)$ and $\mathrm{par}(2)$ must contain the boundaries $a$ and $b$ respectively.
If an exponential distribution is used then $\mathrm{par}(1)$ must contain the parameter $\lambda$. $\mathrm{par}(2)$ is not used.
If a $\chi^2$-distribution is used then $\mathrm{par}(1)$ must contain the number of degrees of freedom. $\mathrm{par}(2)$ is not used.
If a gamma distribution is used $\mathrm{par}(1)$ and $\mathrm{par}(2)$ must contain the parameters $\alpha$ and $\beta$ respectively.
Constraints:
  • if dist='N', $\mathrm{par}(2) > 0.0$;
  • if dist='U', $\mathrm{par}(1) < \mathrm{par}(2)$, $\mathrm{par}(1) \leq \mathrm{cb}(1)$ and $\mathrm{par}(2) \geq \mathrm{cb}(\mathrm{nclass}-1)$;
  • if dist='E', $\mathrm{par}(1) > 0.0$;
  • if dist='C', $\mathrm{par}(1) > 0.0$;
  • if dist='G', $\mathrm{par}(1) > 0.0$ and $\mathrm{par}(2) > 0.0$.
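As a rough illustration of how class probabilities follow from the boundaries and the distribution parameters, the Python sketch below differences the cumulative distribution function at the class boundaries. It covers only the Normal and uniform cases, whose parameters are fully specified above; note that the second Normal parameter is the variance, not the standard deviation. The function names are hypothetical and this is not how the routine itself is necessarily implemented.

```python
import math


def normal_cdf(x, mean, var):
    """CDF of a Normal distribution parameterised by mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def uniform_cdf(x, a, b):
    """CDF of a uniform distribution with boundaries a < b."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


def class_probabilities(cb, dist, par):
    """Probabilities p_1..p_k for upper boundaries cb under dist 'N' or 'U'."""
    if dist == "N":
        cdf = lambda x: normal_cdf(x, par[0], par[1])
    elif dist == "U":
        cdf = lambda x: uniform_cdf(x, par[0], par[1])
    else:
        raise ValueError("this sketch only handles dist = 'N' or 'U'")
    f = [cdf(c) for c in cb]
    # p_1 = F(c_1), p_i = F(c_i) - F(c_{i-1}), p_k = 1 - F(c_{k-1})
    return [f[0]] + [hi - lo for lo, hi in zip(f, f[1:])] + [1.0 - f[-1]]
```

For example, `class_probabilities([0.2, 0.4, 0.6, 0.8], "U", [0.0, 1.0])` gives five classes of equal probability.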
6: npest Integer Input
On entry: the number of estimated parameters of the distribution.
Constraint: $0 \leq \mathrm{npest} < \mathrm{nclass}-1$.
7: prob(nclass) Real (Kind=nag_wp) array Input
On entry: if you are supplying the probability distribution (i.e., dist='A') then $\mathrm{prob}(i)$ must contain the probability that $X$ lies in the $i$th class.
If dist $\neq$ 'A', prob is not referenced.
Constraints:
if dist='A',
  • $\mathrm{prob}(i) > 0.0$, for $i = 1, 2, \ldots, k$;
  • $\sum_{i=1}^{k} \mathrm{prob}(i) = 1.0$.
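When supplying your own probabilities it can be useful to check them before calling the routine. The following Python sketch tests both constraints; the function name `check_class_probabilities` and the tolerance `tol` used for the sum are assumptions, since the tolerance applied by the routine is not stated here.

```python
import math


def check_class_probabilities(prob, tol=1e-10):
    """Raise ValueError unless every prob[i] > 0 and the values sum to 1 within tol."""
    for i, p_i in enumerate(prob, start=1):
        if not p_i > 0.0:
            raise ValueError(f"prob({i}) = {p_i} must be > 0.0")
    total = math.fsum(prob)                # accurately rounded sum
    if abs(total - 1.0) > tol:
        raise ValueError(f"sum of prob = {total}, expected 1.0")
    return total
```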
8: chisq Real (Kind=nag_wp) Output
On exit: the test statistic, $X^2$, for the $\chi^2$ goodness-of-fit test.
9: p Real (Kind=nag_wp) Output
On exit: the upper tail probability from the $\chi^2$-distribution associated with the test statistic, $X^2$, and the number of degrees of freedom.
10: ndf Integer Output
On exit: contains nclass-1-npest, the degrees of freedom associated with the test.
11: eval(nclass) Real (Kind=nag_wp) array Output
On exit: $\mathrm{eval}(i)$ contains the expected frequency for the $i$th class, $E_i$, for $i = 1, 2, \ldots, k$.
12: chisqi(nclass) Real (Kind=nag_wp) array Output
On exit: $\mathrm{chisqi}(i)$ contains the contribution from the $i$th class to the test statistic, that is, $(O_i - E_i)^2 / E_i$, for $i = 1, 2, \ldots, k$.
13: ifail Integer Input/Output
On entry: ifail must be set to 0, -1 or 1 to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of 0 causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of -1 means that an error message is printed while a value of 1 means that it is not.
If halting is not appropriate, the value -1 or 1 is recommended. If message printing is undesirable, then the value 1 is recommended. Otherwise, the value -1 is recommended since useful values can be provided in some output arguments even when $\mathrm{ifail} \neq 0$ on exit. When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit: ifail=0 unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry ifail=0 or -1, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
Note: in some cases g08cgf may return useful information.
ifail=1
On entry, nclass=value.
Constraint: $\mathrm{nclass} \geq 2$.
ifail=2
On entry, dist=value.
Constraint: dist='N', 'U', 'E', 'C', 'G' or 'A'.
ifail=3
On entry, npest=value.
Constraint: $0 \leq \mathrm{npest} < \mathrm{nclass}-1$.
ifail=4
On entry, i=value and ifreq(i)=value.
Constraint: $\mathrm{ifreq}(i) \geq 0$.
ifail=5
On entry, i=value, cb(i-1)=value and cb(i)=value.
Constraint: $\mathrm{cb}(i-1) < \mathrm{cb}(i)$.
ifail=6
On entry, cb(1)=value.
Constraint: $\mathrm{cb}(1) \geq 0.0$.
ifail=7
On entry, par(1)=value.
Constraint: for the exponential distribution, $\mathrm{par}(1) > 0.0$.
On entry, par(1)=value.
Constraint: for the $\chi^2$ distribution, $\mathrm{par}(1) > 0.0$.
On entry, par(1)=value and par(2)=value.
Constraint: for the gamma distribution, $\mathrm{par}(1) > 0.0$ and $\mathrm{par}(2) > 0.0$.
On entry, par(1)=value and par(2)=value.
Constraint: for the uniform distribution, $\mathrm{par}(1) < \mathrm{par}(2)$, $\mathrm{par}(1) \leq \mathrm{cb}(1)$ and $\mathrm{par}(2) \geq \mathrm{cb}(\mathrm{nclass}-1)$.
On entry, par(2)=value.
Constraint: for the Normal distribution, $\mathrm{par}(2) > 0.0$.
ifail=8
On entry, i=value and prob(i)=value.
Constraint: $\mathrm{prob}(i) > 0.0$.
On entry, $\sum_i \mathrm{prob}(i)$=value.
Constraint: $\sum_i \mathrm{prob}(i) = 1.0$.
ifail=9
An expected frequency equals zero while the corresponding observed frequency is nonzero.
ifail=10
At least one class has an expected frequency less than 1. The $\chi^2$ distribution may not be a good approximation to the distribution of the test statistic.
ifail=11
The solution has failed to converge whilst computing the expected values. The returned solution may be an adequate approximation.
ifail=-99
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
ifail=-399
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
ifail=-999
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
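The conditions behind ifail=9 and ifail=10 depend only on the observed and expected frequencies, so they can be anticipated from the values returned in eval. The Python sketch below lists which of the two conditions hold for given frequencies; the function name `expected_frequency_flags` is hypothetical, and the sketch does not claim to reproduce which single value of ifail the routine returns when both conditions hold.

```python
def expected_frequency_flags(ifreq, expected):
    """Return the sorted list of conditions (9 and/or 10) that the frequencies satisfy."""
    flags = set()
    for o_i, e_i in zip(ifreq, expected):
        if e_i == 0.0 and o_i != 0:
            flags.add(9)     # zero expected frequency with a nonzero observed frequency
        if e_i < 1.0:
            flags.add(10)    # expected frequency less than 1
    return sorted(flags)
```

Merging adjacent classes with small expected frequencies before calling the routine is a common way to avoid the second condition.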

7 Accuracy

The computations are believed to be stable.

8 Parallelism and Performance

g08cgf is not threaded in any implementation.

9 Further Comments

The time taken by g08cgf is dependent both on the distribution chosen and on the number of classes, k.

10 Example

This example applies the $\chi^2$ goodness-of-fit test to test whether there is evidence to suggest that a sample of 100 randomly generated observations does not arise from a uniform distribution $U(0,1)$. The class intervals are calculated such that the interval $[0,1]$ is divided into five equal classes. The frequencies for each class are calculated using g01aef.

10.1 Program Text

Program Text (g08cgfe.f90)

10.2 Program Data

Program Data (g08cgfe.d)

10.3 Program Results

Program Results (g08cgfe.r)