NAG Toolbox: nag_nonpar_test_chisq (g08cg)
Purpose
nag_nonpar_test_chisq (g08cg) computes the test statistic for the $\chi^2$ goodness-of-fit test for data with a chosen number of class intervals.
Syntax
[chisq, p, ndf, eval, chisqi, ifail] = g08cg(ifreq, cb, dist, par, npest, prob, 'nclass', nclass)
[chisq, p, ndf, eval, chisqi, ifail] = nag_nonpar_test_chisq(ifreq, cb, dist, par, npest, prob, 'nclass', nclass)
Description
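The $\chi^2$ goodness-of-fit test performed by nag_nonpar_test_chisq (g08cg) is used to test the null hypothesis that a random sample arises from a specified distribution against the alternative hypothesis that the sample does not arise from the specified distribution.
Given a sample of size $n$, denoted by $x_1, x_2, \ldots, x_n$, drawn from a random variable $X$, and that the data has been grouped into $k$ classes,
$$x \le c_1, \qquad c_{i-1} < x \le c_i \quad (i = 2, 3, \ldots, k-1), \qquad x > c_{k-1},$$
then the $\chi^2$ goodness-of-fit test statistic is defined by
$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},$$
where $O_i$ is the observed frequency of the $i$th class, and $E_i$ is the expected frequency of the $i$th class.
The expected frequencies are computed as
$$E_i = p_i \times n,$$
where $p_i$ is the probability that $X$ lies in the $i$th class, that is
$$p_1 = P(X \le c_1), \qquad p_i = P(c_{i-1} < X \le c_i) \quad (i = 2, 3, \ldots, k-1), \qquad p_k = P(X > c_{k-1}).$$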
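As an illustration of how the expected frequencies and the test statistic fit together, here is a minimal base-MATLAB sketch that does not call the NAG Toolbox. The observed frequencies are an assumption, chosen to be consistent with the per-class contributions printed in the example output on this page, and all variable names are illustrative only.

```matlab
% Observed class frequencies O_i (assumed values, consistent with the
% contributions printed in this page's example output).
O = [12; 31; 23; 11; 23];
n = sum(O);                    % sample size
k = numel(O);                  % number of classes

% Class probabilities p_i under the null hypothesis: U(0,1) split into
% five equal classes, so every class has probability 0.2.
p_class = repmat(0.2, k, 1);

E       = p_class * n;         % expected frequencies, E_i = p_i * n
contrib = (O - E).^2 ./ E;     % per-class contributions (O_i - E_i)^2 / E_i
X2      = sum(contrib);        % chi-squared test statistic

fprintf('X^2 = %.4f\n', X2);
disp(contrib');
```

Under these assumptions the statistic evaluates to 14.2, in agreement with the example results.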
These probabilities are either taken from a common probability distribution or are supplied by you. The available probability distributions within this function are:
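- Normal distribution with mean $\mu$, variance $\sigma^2$;
- uniform distribution on the interval $[a, b]$;
- exponential distribution with probability density function $\lambda e^{-\lambda x}$;
- $\chi^2$-distribution with $f$ degrees of freedom; and
- gamma distribution with probability density function $\dfrac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}$.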
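To make the link between a distribution and the class probabilities concrete, the sketch below derives them from a cumulative distribution function in base MATLAB. It is an illustration only and not necessarily how g08cg evaluates them internally. The class boundaries and the parameter values (`lambda`, `mu`, `sigma2`) are hypothetical, and the exponential distribution is assumed to be parameterised by its rate.

```matlab
% Upper class boundaries c_1 < ... < c_{k-1} (hypothetical values, k = 4).
c = [0.5; 1.0; 2.0];

% Exponential distribution with rate lambda (hypothetical value):
% CDF F(x) = 1 - exp(-lambda * x) for x >= 0.
lambda = 1.0;
F_exp  = 1 - exp(-lambda * c);

% Normal distribution with mean mu and variance sigma2 (hypothetical values):
% CDF expressed through the complementary error function.
mu     = 1.0;
sigma2 = 4.0;
F_norm = 0.5 * erfc(-(c - mu) / sqrt(2 * sigma2));

% p_1 = F(c_1), p_i = F(c_i) - F(c_{i-1}), p_k = 1 - F(c_{k-1}).
class_probs = @(F) diff([0; F; 1]);
p_exp  = class_probs(F_exp);
p_norm = class_probs(F_norm);

disp([p_exp, p_norm]);
```

Each column holds one probability per class and sums to one, which is the property the expected frequencies rely on.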
You must supply the frequencies and classes. Given a set of data and classes the frequencies may be calculated using nag_stat_frequency_table (g01ae).
nag_nonpar_test_chisq (g08cg) returns the $\chi^2$ test statistic, $X^2$, together with its degrees of freedom and the upper tail probability from the $\chi^2$-distribution associated with the test statistic. Note that the use of the $\chi^2$-distribution as an approximation to the distribution of the test statistic improves as the expected values in each class increase.
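The upper tail probability can be checked independently with base MATLAB's regularised incomplete gamma function. The sketch below assumes the degrees of freedom are the number of classes minus one, minus the number of estimated parameters, and takes the statistic from the example output on this page; it is a cross-check, not the function's own implementation.

```matlab
X2    = 14.2;   % test statistic (value taken from the example output)
k     = 5;      % number of classes
npest = 0;      % number of parameters estimated from the data

% Degrees of freedom (assumed to be k - 1 - npest).
ndf = k - 1 - npest;

% Upper tail of the chi-squared distribution with ndf degrees of freedom,
% expressed through the regularised upper incomplete gamma function.
p = gammainc(X2 / 2, ndf / 2, 'upper');

fprintf('ndf = %d, p = %.4f\n', ndf, p);   % prints: ndf = 4, p = 0.0067
```

The printed probability matches the significance level reported in the example results.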
References
Conover W J (1980) Practical Nonparametric Statistics Wiley
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill
Parameters
Compulsory Input Parameters
- 1: ifreq(nclass) – int64/int32/nag_int array
ifreq(i) must specify the frequency of the $i$th class, $O_i$, for $i = 1, 2, \ldots, k$.
Constraint: ifreq(i) ≥ 0, for $i = 1, 2, \ldots, k$.
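- 2: cb(nclass − 1) – double array
cb(i) must specify the upper boundary value for the $i$th class, for $i = 1, 2, \ldots, k-1$.
Constraint: cb(1) < cb(2) < $\cdots$ < cb(nclass − 1). For the exponential, gamma and $\chi^2$-distributions cb(1) ≥ 0.0.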
- 3: dist – string (length ≥ 1)
Indicates for which distribution the test is to be carried out.
- dist = 'N': The Normal distribution is used.
- dist = 'U': The uniform distribution is used.
- dist = 'E': The exponential distribution is used.
- dist = 'C': The $\chi^2$-distribution is used.
- dist = 'G': The gamma distribution is used.
- dist = 'A': You must supply the class probabilities in the array prob.
Constraint: dist = 'N', 'U', 'E', 'C', 'G' or 'A'.
- 4: par(2) – double array
Must contain the parameters of the distribution which is being tested. If you supply the probabilities (i.e., dist = 'A') the array par is not referenced.
If a Normal distribution is used then par(1) and par(2) must contain the mean, $\mu$, and the variance, $\sigma^2$, respectively.
If a uniform distribution is used then par(1) and par(2) must contain the boundaries $a$ and $b$ respectively.
If an exponential distribution is used then par(1) must contain the parameter $\lambda$. par(2) is not used.
If a $\chi^2$-distribution is used then par(1) must contain the number of degrees of freedom. par(2) is not used.
If a gamma distribution is used then par(1) and par(2) must contain the parameters $\alpha$ and $\beta$ respectively.
Constraints:
- if dist = 'N', par(2) > 0.0;
- if dist = 'U', par(1) < par(2) and par(1) ≤ cb(1) and par(2) ≥ cb(nclass − 1);
- if dist = 'E', par(1) > 0.0;
- if dist = 'C', par(1) > 0.0;
- if dist = 'G', par(1) > 0.0 and par(2) > 0.0.
- 5: npest – int64/int32/nag_int scalar
The number of estimated parameters of the distribution.
Constraint: 0 ≤ npest < nclass − 1.
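- 6: prob(nclass) – double array
If you are supplying the probability distribution (i.e., dist = 'A') then prob(i) must contain the probability that $X$ lies in the $i$th class.
If dist ≠ 'A', prob is not referenced.
Constraint: if dist = 'A', $\sum_{i=1}^{k}$ prob(i) = 1.0, prob(i) > 0.0, for $i = 1, 2, \ldots, k$.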
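Before supplying your own class probabilities it can be worth checking them against the stated constraint. The sketch below is one hedged way to do so in base MATLAB. The probabilities are hypothetical, and the tolerance used for the sum is an assumption, since the tolerance applied by g08cg itself is not stated here.

```matlab
% Hypothetical class probabilities for a four-class problem.
prob = [0.1; 0.4; 0.3; 0.2];

% Tolerance for the sum-to-one check (an assumption, not taken from g08cg).
tol = 10 * eps;

all_positive = all(prob > 0);                 % every class probability > 0
sums_to_one  = abs(sum(prob) - 1) <= tol;     % probabilities sum to 1

if ~(all_positive && sums_to_one)
    error('prob must be strictly positive and sum to 1');
end
disp('prob satisfies the constraint');
```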
Optional Input Parameters
- 1: nclass – int64/int32/nag_int scalar
Default: the dimension of the arrays ifreq, prob. (An error is raised if these dimensions are not equal.)
$k$, the number of classes into which the data is divided.
Constraint: nclass ≥ 2.
Output Parameters
- 1: chisq – double scalar
The test statistic, $X^2$, for the $\chi^2$ goodness-of-fit test.
- 2: p – double scalar
The upper tail probability from the $\chi^2$-distribution associated with the test statistic, $X^2$, and the number of degrees of freedom.
- 3: ndf – int64/int32/nag_int scalar
Contains (nclass − 1 − npest), the degrees of freedom associated with the test.
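- 4: eval(nclass) – double array
eval(i) contains the expected frequency for the $i$th class, $E_i$, for $i = 1, 2, \ldots, k$.
- 5: chisqi(nclass) – double array
chisqi(i) contains the contribution from the $i$th class to the test statistic, that is, $(O_i - E_i)^2 / E_i$, for $i = 1, 2, \ldots, k$.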
- 6: ifail – int64/int32/nag_int scalar
ifail = 0 unless the function detects an error (see Error Indicators and Warnings).
Error Indicators and Warnings
Note: nag_nonpar_test_chisq (g08cg) may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the function:
Cases prefixed with W are classified as warnings and
do not generate an error of type NAG:error_n. See nag_issue_warnings.
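- ifail = 1
On entry, nclass < 2.
- ifail = 2
On entry, dist is invalid.
- ifail = 3
On entry, npest < 0, or npest ≥ nclass − 1.
- ifail = 4
On entry, ifreq(i) < 0 for some $i$, for $i = 1, 2, \ldots, k$.
- ifail = 5
On entry, the elements of cb are not in ascending order. That is, cb(i) ≤ cb(i − 1) for some $i$, for $i = 2, 3, \ldots, k-1$.
- ifail = 6
On entry, dist = 'E', 'C' or 'G' and cb(1) < 0.0. No negative class boundary values are valid for the exponential, gamma or $\chi^2$-distributions.
- ifail = 7
On entry, the values provided in par are invalid.
- ifail = 8
On entry, dist = 'A' with prob(i) ≤ 0.0, for some $i$, for $i = 1, 2, \ldots, k$, or $\sum_{i=1}^{k}$ prob(i) ≠ 1.0.
- ifail = 9
An expected frequency is equal to zero when the observed frequency was not.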
- W ifail = 10
This is a warning that expected values for certain classes are less than 1.0. This implies that we cannot be confident that the $\chi^2$-distribution is a good approximation to the distribution of the test statistic.
- W ifail = 11
The solution obtained when calculating the probability for a certain class for the gamma or $\chi^2$-distribution did not converge in 600 iterations. The solution may be an adequate approximation.
- ifail = −99
An unexpected error has been triggered by this routine. Please contact NAG.
- ifail = −399
Your licence key may have expired or may not have been installed correctly.
- ifail = −999
Dynamic memory allocation failed.
Accuracy
The computations are believed to be stable.
Further Comments
The time taken by nag_nonpar_test_chisq (g08cg) is dependent both on the distribution chosen and on the number of classes, $k$.
Example
This example applies the $\chi^2$ goodness-of-fit test to test whether there is evidence to suggest that a sample of 100 randomly generated observations does not arise from a uniform distribution $U(0,1)$. The class intervals are calculated such that the interval $(0,1)$ is divided into five equal classes. The frequencies for each class are calculated using nag_stat_frequency_table (g01ae).
function g08cg_example

fprintf('g08cg example results\n\n');

% Sample of 100 observations to be tested against U(0,1)
x = [ 0.59 0.23 0.76 0.96 0.20 0.91 0.29 0.22 0.36 0.81 ...
      0.91 0.80 0.17 0.82 0.07 0.74 0.15 0.91 0.26 0.98 ...
      0.59 0.34 0.28 0.95 0.33 0.42 0.72 0.35 0.86 0.22 ...
      0.15 0.39 0.32 0.82 0.13 0.48 0.46 0.74 0.99 0.26 ...
      0.04 0.21 0.04 0.24 0.56 0.36 0.48 0.53 1.00 0.58 ...
      0.50 0.41 0.03 0.38 0.89 0.40 0.66 0.79 0.34 0.94 ...
      0.49 0.12 0.24 0.05 1.00 0.29 0.67 0.29 0.75 0.81 ...
      0.45 0.21 0.51 0.68 0.78 0.20 0.23 0.57 0.25 0.48 ...
      0.96 0.33 0.48 0.55 0.04 0.48 0.42 0.11 0.38 0.73 ...
      0.91 0.45 0.59 0.97 0.27 0.27 0.25 0.99 0.99 0.80];

% Class boundaries and number of classes
cb = [0.2; 0.4; 0.6; 0.8; 1.0 ];
nclass = int64(5);

% Count the observations falling in each class
[~, ifreq, ~, ~, ifail] = ...
  g01ae( ...
         nclass, x, 'cb', cb);

% Uniform distribution on [0,1]; no parameters were estimated from the data
dist = 'Uniform';
npest = int64(0);
par = [0; 1];
prob = zeros(nclass,1);   % not referenced unless dist = 'A'

% Perform the chi-squared goodness-of-fit test
[chisq, p, ndf, eval, chisqi, ifail] = ...
  g08cg( ...
         ifreq, cb, dist, par, npest, prob, 'nclass', nclass);

% Report the results
fprintf('Chi-squared test statistic = %10.4f\n', chisq);
fprintf('Degrees of freedom. = %5d\n', ndf);
fprintf('Significance level = %10.4f\n\n', p);
fprintf('The contributions to the test statistic are :-\n');
disp(chisqi');
g08cg example results
Chi-squared test statistic = 14.2000
Degrees of freedom. = 4
Significance level = 0.0067
The contributions to the test statistic are :-
3.2000 6.0500 0.4500 4.0500 0.4500
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015