naginterfaces.library.stat.contingency_table
- naginterfaces.library.stat.contingency_table(nobs, num=0)
contingency_table performs the analysis of a two-way $r \times c$ contingency table or classification. If $r = c = 2$, and the total number of objects classified is $40$ or fewer, then the probabilities for Fisher’s exact test are computed. Otherwise, a test statistic is computed (with Yates’ correction when $r = c = 2$), which under the assumption of no association between the classifications has approximately a chi-square distribution with $(r-1) \times (c-1)$ degrees of freedom.
For full information please refer to the NAG Library document for g01af
https://support.nag.com/numeric/nl/nagdoc_30.3/flhtml/g01/g01aff.html
- Parameters
- nobs : int, array-like, two-dimensional
The elements of nobs outside its last row and last column must contain the frequencies $n_{ij}$ for the two-way classification. The last row and the last column of nobs need not be set.
- num : int, optional
The value assigned to num determines whether automatic ‘shrinkage’ is required when any expected frequency $r_{ij} < 1$, as outlined in the first of the two situations described in Notes.
If num $= 1$, shrinkage is required; otherwise shrinkage is not required.
- Returns
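- nobs : int, ndarray, two-dimensional
Contains the following information:
The leading $m_1$ by $n_1$ block, nobs[i-1, j-1], for $i = 1,2,\ldots,m_1$, for $j = 1,2,\ldots,n_1$, contains the frequencies for the two-way classification after ‘shrinkage’ has taken place (see Notes).
The last column, in rows $i = 1,2,\ldots,m_1$, contains the total frequencies in the remaining rows, $R_i$.
The last row, in columns $j = 1,2,\ldots,n_1$, contains the total frequencies in the remaining columns, $C_j$.
The element in the last row and last column contains the total frequency, $T$.
If any ‘shrinkage’ has occurred, all other cells contain no useful information.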
- num : int
When Fisher’s exact test for a $2 \times 2$ classification is used then num contains the number of elements used in the array p, otherwise num is set to zero.
- pred : float, ndarray, two-dimensional
The elements pred[i-1, j-1], where $i = 1,2,\ldots,m_1$ and $j = 1,2,\ldots,n_1$, contain the expected frequencies, $r_{ij}$, corresponding to the observed frequencies nobs[i-1, j-1], except in the case when Fisher’s exact test for a $2 \times 2$ classification is to be used, when pred is not used. No other elements are utilized.
- chis : float
The value of the test statistic, $\chi^2$, except when Fisher’s exact test for a $2 \times 2$ classification is used, in which case it is unspecified.
- p : float, ndarray, one-dimensional
The first num elements contain the probabilities associated with the various possible frequency tables, $P_{r+1}$, for $r = 0,1,\ldots,R_1$; the remainder are unspecified.
- npos : int
p[npos-1] holds the probability associated with the given table of frequencies.
- ndf : int
The value of ndf gives the number of degrees of freedom for the chi-square distribution, $(m_1 - 1) \times (n_1 - 1)$; when Fisher’s exact test is used, ndf $= 1$.
- m1 : int
The number of rows of the two-way classification, after any ‘shrinkage’, $m_1$.
- n1 : int
The number of columns of the two-way classification, after any ‘shrinkage’, $n_1$.
- Raises
- NagValueError
- (errno )
The number of rows or columns of nobs is less than $2$.
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: .
- (errno )
At least one frequency is negative, or all frequencies are zero.
- Notes
No equivalent traditional C interface for this routine exists in the NAG Library.
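The data consist of the frequencies for the two-way classification, denoted by $n_{ij}$, for $i = 1,2,\ldots,m$, for $j = 1,2,\ldots,n$ with $m, n > 1$.
A check is made to see whether any row or column of the matrix of frequencies consists entirely of zeros, and if so, the matrix of frequencies is reduced by omitting that row or column. Suppose the final size of the matrix is $m_1$ by $n_1$ ($m_1, n_1 > 1$), and let
$R_i = \sum_{j=1}^{n_1} n_{ij}$, the total frequency for the $i$th row, for $i = 1,2,\ldots,m_1$,
$C_j = \sum_{i=1}^{m_1} n_{ij}$, the total frequency for the $j$th column, for $j = 1,2,\ldots,n_1$, and
$T = \sum_{i=1}^{m_1} R_i = \sum_{j=1}^{n_1} C_j$, the total frequency.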
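The reduction step and the marginal totals can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper name `reduce_and_total` is hypothetical, the input is taken to be a list of equal-length rows of integers, and the code is not the library's own implementation.

```python
def reduce_and_total(freq):
    """Drop all-zero rows and columns, then return (table, R, C, T).

    Hypothetical helper for illustration; not part of naginterfaces.
    """
    if any(x < 0 for row in freq for x in row):
        raise ValueError("at least one frequency is negative")
    if not any(x for row in freq for x in row):
        raise ValueError("all frequencies are zero")
    # Keep only rows that contain at least one non-zero frequency.
    rows = [row for row in freq if any(row)]
    # Keep only columns that contain at least one non-zero frequency.
    keep = [j for j in range(len(freq[0])) if any(row[j] for row in rows)]
    table = [[row[j] for j in keep] for row in rows]
    R = [sum(row) for row in table]        # row totals R_i
    C = [sum(col) for col in zip(*table)]  # column totals C_j
    T = sum(R)                             # total frequency, equal to sum(C)
    return table, R, C, T


table, R, C, T = reduce_and_total([[3, 0, 5], [0, 0, 0], [2, 0, 7]])
```

Here the all-zero middle row and middle column are omitted, so the reduced matrix has two rows and two columns.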
There are two situations:
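If $m_1 > 2$ and/or $n_1 > 2$, or $m_1 = n_1 = 2$ and $T > 40$, then the matrix of expected frequencies, denoted by $r_{ij}$, for $i = 1,2,\ldots,m_1$ and $j = 1,2,\ldots,n_1$, and the test statistic, $\chi^2$, are computed, where
$$r_{ij} = \frac{R_i C_j}{T}$$
and
$$\chi^2 = \sum_{i=1}^{m_1} \sum_{j=1}^{n_1} \frac{\left( \left| r_{ij} - n_{ij} \right| - Y \right)^2}{r_{ij}},$$
where
$$Y = \begin{cases} \frac{1}{2} & \text{if } m_1 = n_1 = 2, \\ 0 & \text{otherwise} \end{cases}$$
is Yates’ correction for continuity.
Under the assumption that there is no association between the two classifications, $\chi^2$ will have approximately a chi-square distribution with $(m_1 - 1) \times (n_1 - 1)$ degrees of freedom.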
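As a rough illustration of this first situation, the sketch below computes the expected frequencies as row total times column total divided by the total frequency, and then the test statistic with Yates’ correction of one half applied only to a two-by-two table. The function name `chi_square_statistic` is hypothetical, the table is assumed to contain no all-zero row or column, and the code is a plain-Python sketch rather than the routine's actual implementation.

```python
def chi_square_statistic(table):
    """Return (expected, chi2, ndf) for a table of observed frequencies.

    Hypothetical helper for illustration; not part of naginterfaces.
    Assumes every row total and column total is positive.
    """
    m1, n1 = len(table), len(table[0])
    R = [sum(row) for row in table]        # row totals
    C = [sum(col) for col in zip(*table)]  # column totals
    T = sum(R)                             # total frequency
    # Yates' correction applies only to a 2 x 2 table.
    Y = 0.5 if (m1 == 2 and n1 == 2) else 0.0
    expected = [[R[i] * C[j] / T for j in range(n1)] for i in range(m1)]
    chi2 = sum(
        (abs(expected[i][j] - table[i][j]) - Y) ** 2 / expected[i][j]
        for i in range(m1)
        for j in range(n1)
    )
    ndf = (m1 - 1) * (n1 - 1)              # degrees of freedom
    return expected, chi2, ndf


expected, chi2, ndf = chi_square_statistic([[10, 20, 30], [20, 20, 20]])
```

The sketch does not check whether Fisher’s exact test would be used instead for a small two-by-two table; that choice is left to the caller.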
An option exists which allows for further ‘shrinkage’ of the matrix of frequencies in the case where $r_{ij} < 1$ for the $(i,j)$th cell. If this is the case, then row $i$ or column $j$ will be combined with the adjacent row or column with smaller total. Row $i$ is selected for combination if $R_i \times n_1 \le C_j \times m_1$. This ‘shrinking’ process is continued until $r_{ij} \ge 1$ for all cells $(i,j)$.
If $m_1 = n_1 = 2$ and $T \le 40$, the probabilities to enable Fisher’s exact test to be made are computed.
The matrix of frequencies may be rearranged so that $R_1$ is the smallest marginal (i.e., column and row) total, and $C_2 \ge C_1$. Under the assumption of no association between the classifications, the probability of obtaining $r$ entries in cell $(1,1)$ is computed where
$$P_{r+1} = \frac{R_1! \, R_2! \, C_1! \, C_2!}{T! \, r! \, (R_1 - r)! \, (C_1 - r)! \, (T - C_1 - R_1 + r)!}, \quad r = 0,1,\ldots,R_1.$$
The probability of obtaining the table of given frequencies is returned. A test of the assumption against some alternative may then be made by summing the relevant values of $P_{r+1}$.
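The probabilities for this second situation follow the usual hypergeometric form for a two-by-two table with fixed margins: the product of the factorials of the four marginal totals, divided by the factorial of the total frequency and the factorials of the four cell entries. The sketch below evaluates that expression for every possible entry $r$ in cell $(1,1)$. It assumes the margins are already arranged so that the first row total is the smallest marginal total and the second column total is at least the first; the function name `fisher_table_probabilities` is hypothetical and the code is not the library's implementation.

```python
from math import factorial


def fisher_table_probabilities(R1, R2, C1, C2):
    """Return the probabilities of r = 0, 1, ..., R1 entries in cell (1,1).

    Hypothetical helper for illustration; not part of naginterfaces.
    Assumes R1 is the smallest marginal total and C2 >= C1.
    """
    T = R1 + R2
    if C1 + C2 != T:
        raise ValueError("row totals and column totals must sum to the same T")
    if R1 > min(R2, C1, C2) or C2 < C1:
        raise ValueError("expected R1 as smallest marginal total and C2 >= C1")
    numerator = factorial(R1) * factorial(R2) * factorial(C1) * factorial(C2)
    probs = []
    for r in range(R1 + 1):
        # Cell entries are r, R1 - r, C1 - r and T - C1 - R1 + r.
        denominator = (
            factorial(T)
            * factorial(r)
            * factorial(R1 - r)
            * factorial(C1 - r)
            * factorial(T - C1 - R1 + r)
        )
        probs.append(numerator / denominator)
    return probs


# Observed table [[1, 3], [4, 2]]: R1 = 4, R2 = 6, C1 = 5, C2 = 5.
probs = fisher_table_probabilities(4, 6, 5, 5)
observed = probs[1]            # probability of the observed r = 1
lower_tail = sum(probs[:2])    # r <= 1, one possible one-sided sum
```

Which values are "relevant" to sum depends on the alternative hypothesis being tested; the lower-tail sum shown is just one choice.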