naginterfaces.library.contab.tabulate_stat

naginterfaces.library.contab.tabulate_stat(stat, update, weight, isf, lfac, ifac, y, wt, table, ncells, icount, auxt)

tabulate_stat computes a table from a set of classification factors using a selected statistic.

For full information please refer to the NAG Library document for g11ba

https://support.nag.com/numeric/nl/nagdoc_30/flhtml/g11/g11baf.html

Parameters
stat : str, length 1

Indicates which statistic is to be computed for the table cells.

stat = 'N': The number of observations for each cell.

stat = 'T': The total for the variable in y for each cell.

stat = 'A': The average (mean) for the variable in y for each cell.

stat = 'V': The variance for the variable in y for each cell.

stat = 'L': The largest value for the variable in y for each cell.

stat = 'S': The smallest value for the variable in y for each cell.

update : str, length 1

Indicates whether an existing table is to be updated by further observations.

update = 'I': The table cells will be initialized to zero before tabulations take place.

update = 'U': The table input in table will be updated. The arguments table, ncells, icount and auxt must remain unchanged from the previous call to tabulate_stat.

weight : str, length 1

Indicates if weights are to be used.

weight = 'U': Weights are not used and unit weights are assumed.

weight = 'W' or 'V': Weights are used and must be supplied in wt. The only difference between weight = 'W' and weight = 'V' arises when the variance is computed.

If weight = 'W', the divisor for the variance is the sum of the weights minus one; if weight = 'V', the divisor is the number of observations with nonzero weights minus one. The former is useful if the weights represent the frequency of the observed values.

If stat = 'T' or 'A', the weighted total or mean is computed respectively.

If stat = 'N', 'L' or 'S', the only effect of weights is to eliminate values with zero weights from the computations.

isf : int, array-like, shape (m)

Indicates which factors in ifac are to be used in the tabulation.

If isf[j-1] > 0, the jth factor in ifac is included in the tabulation.

Note that if isf[j-1] <= 0 for j = 1, ..., m, then the statistic for the whole sample is calculated and returned in a 1 × 1 table.

lfac : int, array-like, shape (m)

The number of levels of the classifying factors in ifac.

ifac : int, array-like, shape (n, m)

The coded classification factors for the observations.

y : float, array-like, shape (n)

The variable to be tabulated. If stat = 'N', y is not referenced.

wt : float, array-like, shape (:)

Note: the required length for this argument is determined as follows: if weight = 'W' or 'V': n; otherwise: 0.

If weight = 'W' or 'V', wt must contain the weights. Otherwise wt is not referenced.

table : float, array-like, shape (maxt)

If update = 'U', table must be unchanged from the previous call to tabulate_stat, otherwise table need not be set.

ncells : int

If update = 'U', ncells must be unchanged from the previous call to tabulate_stat, otherwise ncells need not be set.

icount : int, array-like, shape (maxt)

If update = 'U', icount must be unchanged from the previous call to tabulate_stat, otherwise icount need not be set.

auxt : float, array-like, shape (:)

Note: the required length for this argument is determined as follows: if stat = 'V': 2 × maxt; if stat = 'A': maxt; otherwise: 0.

If update = 'U', auxt must be unchanged from the previous call to tabulate_stat, otherwise auxt need not be set.
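The two variance divisors described under weight can be compared with a small standalone sketch for a single cell. It does not call tabulate_stat: the helper name weighted_variance is hypothetical, 'W' and 'V' are taken as the two weighted settings, and the numerator is assumed to be the weighted sum of squares about the weighted mean.

```python
def weighted_variance(y, wt, mode):
    """Variance of one cell (hypothetical helper, not part of naginterfaces).

    mode 'W': divisor is the sum of the weights minus one.
    mode 'V': divisor is the number of nonzero-weight observations minus one.
    """
    # Observations with zero weight take no part in the computation.
    pairs = [(yi, wi) for yi, wi in zip(y, wt) if wi != 0.0]
    sum_w = sum(wi for _, wi in pairs)
    mean = sum(wi * yi for yi, wi in pairs) / sum_w
    # Assumed numerator: weighted sum of squares about the weighted mean.
    ss = sum(wi * (yi - mean) ** 2 for yi, wi in pairs)
    if mode == "W":
        divisor = sum_w - 1.0
    elif mode == "V":
        divisor = len(pairs) - 1.0
    else:
        raise ValueError("mode must be 'W' or 'V'")
    if divisor <= 0.0:
        raise ValueError("variance divisor must be positive")
    return ss / divisor


# Illustrative values only: the third observation has zero weight.
y = [2.0, 4.0, 6.0]
wt = [2.0, 1.0, 0.0]
var_w = weighted_variance(y, wt, "W")
var_v = weighted_variance(y, wt, "V")
```

With frequency weights, the 'W' result equals the ordinary sample variance of the expanded data 2, 2, 4, which is why that divisor suits weights that count repeated observations.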

Returns
table : float, ndarray, shape (maxt)

The computed table. The cells of the table are stored so that, for any two factors, the index relating to the factor referred to later in lfac and ifac changes faster. For further details see Further Comments.

ncells : int

The number of cells in the table.

ndim : int

The number of factors defining the table.

idim : int, ndarray, shape (m)

The first ndim elements contain the number of levels for the factors defining the table.

icount : int, ndarray, shape (maxt)

A table containing the number of observations contributing to each cell of the table, stored identically to table. Note that if stat = 'N', this is the same as is returned in table.

auxt : float, ndarray, shape (:)

If stat = 'A' or 'V', the first ncells values hold the table containing the sum of the weights for the observations contributing to each cell, stored identically to table.

If stat = 'V', the second set of ncells values holds the table of cell means.

Otherwise auxt is not referenced.
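The storage rule for the returned table (the later factor's index changes faster) can be read as ordinary row-major ordering over the factors that define the table. The following sketch illustrates that reading; the helper cell_index is hypothetical, and it assumes levels coded from 1 and zero-based positions in the returned array.

```python
def cell_index(levels, idim):
    """Zero-based position of a cell in the flat table (hypothetical helper).

    levels: the 1-based level of each factor defining the table.
    idim:   the number of levels of each of those factors, in the same order.
    """
    index = 0
    for level, nlevels in zip(levels, idim):
        if not 1 <= level <= nlevels:
            raise ValueError("level out of range")
        # Later factors vary fastest, so earlier ones are scaled up.
        index = index * nlevels + (level - 1)
    return index


# Two factors with 2 and 3 levels: enumerate cells in storage order.
idim = [2, 3]
order = [cell_index((i, j), idim) for i in (1, 2) for j in (1, 2, 3)]
first_cell_of_second_row = cell_index((2, 1), idim)
```

Under this reading, the same position would index table, icount and the blocks of auxt, since they are described as being stored identically.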

Raises
NagValueError
(errno )

On entry, weight = ⟨value⟩.

Constraint: weight = 'U', 'W' or 'V'.

(errno )

On entry, update = ⟨value⟩.

Constraint: update = 'I' or 'U'.

(errno )

On entry, stat = ⟨value⟩.

Constraint: stat = 'N', 'T', 'A', 'V', 'L' or 'S'.

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, and minimum value for .

Constraint: of the levels of the factors included in the tabulation.

(errno )

On entry, and .

Constraint: .

(errno )

On entry, , , and .

Constraint: .

(errno )

On entry, , and .

Constraint: .

(errno )

On entry, and .

On entry, .

(errno )

The variance divisor .

(errno )

or has changed between calls.

(errno )

has changed between calls.

(errno )

has changed between calls.

Notes

In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

A dataset may include both classification variables and general variables. The classification variables, known as factors, take a small number of values known as levels. For example, the factor sex would have the levels male and female. These can be coded as 1 and 2 respectively. Given several factors, a multi-way table can be constructed such that each cell of the table represents one level from each factor. For example, the two factors sex and habitat, habitat having three levels (inner-city, suburban and rural), define the 2 × 3 contingency table

[table omitted]

For each cell statistics can be computed. If a third variable in the dataset was age, then for each cell the average age could be computed:

[table omitted]

That is the average age for all observations for males living in rural areas is . Other statistics can also be computed: the number of observations, the total, the variance, the largest value and the smallest value.
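As a plain-Python illustration of what a table of cell averages contains, the sketch below groups observations by their factor levels and averages a variable within each group. It does not call tabulate_stat; the function cell_means and the data are invented for illustration, with sex assumed coded 1 and 2 and habitat coded 1 to 3.

```python
from collections import defaultdict


def cell_means(factors, values):
    """Average of values within each combination of factor levels.

    Hypothetical helper: factors is a sequence of level tuples, one per
    observation, and values holds the variable to be averaged.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for levels, value in zip(factors, values):
        key = tuple(levels)
        totals[key] += value
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Invented observations: (sex, habitat) codes and an age for each person.
factors = [(1, 3), (1, 3), (2, 1), (1, 1), (2, 1)]
ages = [30.0, 40.0, 22.0, 25.0, 28.0]
means = cell_means(factors, ages)
```

Cells with no observations are simply absent from the dictionary here, whereas a fixed-size table keeps an entry for every combination of levels.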

tabulate_stat computes a table for one of the selected statistics. The factors have to be coded with levels 1, 2, .... Weights can be used to eliminate values from the calculations, e.g., if they represent ‘missing values’. There is also the facility to update an existing table with the addition of new observations.
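Updating a table with new observations requires statistics that can absorb one observation at a time. The sketch below shows one well-known single-pass scheme of this kind, the weighted mean and sum-of-squares recurrence from West (1979), for a single cell. Whether tabulate_stat uses exactly this recurrence is an assumption, and the class RunningCell is hypothetical.

```python
class RunningCell:
    """Running weighted mean and variance for one cell (hypothetical class)."""

    def __init__(self):
        self.sum_w = 0.0   # sum of weights seen so far
        self.mean = 0.0    # current weighted mean
        self.ss = 0.0      # weighted sum of squares about the mean
        self.count = 0     # observations with nonzero weight

    def add(self, y, w=1.0):
        """Fold in one observation; zero-weight observations are skipped."""
        if w == 0.0:
            return
        new_sum_w = self.sum_w + w
        delta = y - self.mean
        step = delta * w / new_sum_w
        self.mean += step
        self.ss += self.sum_w * delta * step
        self.sum_w = new_sum_w
        self.count += 1

    def variance(self):
        """Variance using the sum of the weights minus one as divisor."""
        return self.ss / (self.sum_w - 1.0)


cell = RunningCell()
# First batch of observations (illustrative values).
cell.add(2.0, 2.0)
cell.add(4.0)
# A later batch updates the same cell without revisiting earlier data.
cell.add(6.0)
```

After both batches, the cell holds the same mean and variance as a single pass over the frequency-expanded data 2, 2, 4, 6.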

References

John, J A and Quenouille, M H, 1977, Experiments: Design and Analysis, Griffin

Kendall, M G and Stuart, A, 1969, The Advanced Theory of Statistics (Volume 1), (3rd Edition), Griffin

West, D H D, 1979, Updating mean and variance estimates: An improved method, Comm. ACM (22), 532–535