NAG Toolbox: nag_stat_frequency_table (g01ae)
Purpose
nag_stat_frequency_table (g01ae) constructs a frequency distribution of a variable, according to either user-supplied, or function-calculated class boundary values.
Syntax
[cb, ifreq, xmin, xmax, ifail] = g01ae(k, x, 'n', n, 'cb', cb)
[cb, ifreq, xmin, xmax, ifail] = nag_stat_frequency_table(k, x, 'n', n, 'cb', cb)
Note: the interface to this routine has changed since earlier releases of the toolbox:
At Mark 23: iclass is no longer an input parameter; cb was made optional; k was made a compulsory input parameter.
Description
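The data consist of a sample of $n$ observations of a continuous variable, denoted by $x_i$, for $i = 1, 2, \ldots, n$. Let $a = \min(x_1, \ldots, x_n)$ and $b = \max(x_1, \ldots, x_n)$.
nag_stat_frequency_table (g01ae) constructs a frequency distribution with $k$ ($> 1$) classes denoted by $f_i$, for $i = 1, 2, \ldots, k$.
The boundary values may be either user-supplied or function-calculated, and are denoted by $y_j$, for $j = 1, 2, \ldots, k-1$.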
If the boundary values of the classes are to be function-calculated, then they are determined in one of the following ways:
(a) if $k > 2$, the range of $x$ values is divided into $k - 2$ intervals of equal length, and two extreme intervals, defined by the class boundary values $y_1, y_2, \ldots, y_{k-1}$;
(b) if $k = 2$, $y_1 = \frac{1}{2}(a + b)$.
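However formed, the values $y_1, \ldots, y_{k-1}$ are assumed to be in ascending order. The class frequencies are formed with
- $f_1 =$ the number of $x$ values in the interval $(-\infty, y_1)$,
- $f_i =$ the number of $x$ values in the interval $[y_{i-1}, y_i)$, for $i = 2, \ldots, k-1$,
- $f_k =$ the number of $x$ values in the interval $[y_{k-1}, \infty)$,
where [ means inclusive, and ) means exclusive. If the class boundary values are function-calculated and $k > 2$, then $f_1 = f_k = 0$, and $y_1$ and $y_{k-1}$ are chosen so that $y_1 < a$ and $y_{k-1} > b$.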
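The rules above can be sketched in plain MATLAB, without calling the toolbox. This is only an illustration under stated assumptions, not the NAG implementation: this page does not say how far the two outer boundaries are moved outside the sample range, so the margin `delta` below is a hypothetical choice, and all variable names are ours.

```matlab
% Illustrative sketch only; not the NAG implementation.
x = [2.0 3.5 1.0 4.0 2.5 3.0];   % sample of observations, in any order
k = 5;                           % number of classes, including the two extreme ones

a = min(x);                      % smallest observation
b = max(x);                      % largest observation

if k == 2
    y = (a + b) / 2;             % case (b): one boundary at the midpoint
else
    y = linspace(a, b, k - 1);   % case (a): k-2 equal intervals spanning [a, b]
    delta = 1e-3 * (b - a);      % ASSUMPTION: hypothetical margin (zero if all x are equal)
    y(1)   = a - delta;          % push the first boundary below a
    y(end) = b + delta;          % push the last boundary above b
end

% Count with half-open classes: (-inf, y1), [y1, y2), ..., [y(k-1), inf)
f = zeros(1, k);
for i = 1:numel(x)
    c = sum(x(i) >= y) + 1;      % boundaries at or below x(i), plus one
    f(c) = f(c) + 1;
end

disp(y);
disp(f);
```

With this sample the two extreme classes are empty and the inner counts are 1, 2 and 3, matching the property that the first and last frequencies are zero when the boundaries are function-calculated and there are more than two classes.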
If a frequency distribution is required for a discrete variable, then it is suggested that you supply the class boundary values; function-calculated boundary values may be slightly imprecise (due to the adjustment of $y_1$ and $y_{k-1}$ outlined above) and cause values very close to a class boundary to be assigned to the wrong class.
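One way to follow this advice for a discrete variable is to place the boundaries midway between adjacent admissible values, so that no observation can fall on or near a boundary. The sketch below uses plain MATLAB and integer-valued data; the half-integer boundaries and the variable names are illustrative choices, not something prescribed by the toolbox.

```matlab
% Illustrative sketch: integer-valued data with half-integer class boundaries.
x = [0 1 1 2 2 2 3 5];           % discrete observations
y = [0.5 1.5 2.5 3.5];           % boundaries midway between the integers
k = numel(y) + 1;                % 5 classes: 0, 1, 2, 3, and 4 or more

% Class index = number of boundaries at or below the value, plus one
idx = ones(size(x));
for j = 1:numel(y)
    idx = idx + (x >= y(j));
end

f = accumarray(idx(:), 1, [k 1]).';   % frequencies as a row vector
disp(f);
```

Because every observation lies at least 0.5 away from the nearest boundary, small rounding differences in the boundaries cannot move a value into a neighbouring class.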
References
None.
Parameters
Compulsory Input Parameters
- 1: k – int64/int32/nag_int scalar
  $k$, the number of classes desired in the frequency distribution. Whether or not class boundary values are user-supplied, k must include the two extreme classes, which stretch to $\pm\infty$.
  Constraint: $k \geq 2$.
- 2: x(n) – double array
  The sample of observations of the variable for which the frequency distribution is required, $x_i$, for $i = 1, 2, \ldots, n$. The values may be in any order.
Optional Input Parameters
- 1: n – int64/int32/nag_int scalar
  Default: the dimension of the array x.
  $n$, the number of observations.
  Constraint: $n \geq 1$.
- 2: cb(k) – double array
  If cb is not supplied, nag_stat_frequency_table (g01ae) calculates $k - 1$ class boundary values.
  If cb is supplied, the first $k - 1$ elements of cb must contain the class boundary values you supplied, in ascending order.
  Constraint: cb(i) < cb(i+1), for $i = 1, 2, \ldots, k-2$.
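Since boundaries that are not in ascending order are reported as an error, it can be worth checking the constraint before the call. The following is a minimal sketch in plain MATLAB. Storing the boundaries in an array of length k and the name/value call form in the final comment are assumptions based on the parameter descriptions, and the boundary values themselves are made up for illustration.

```matlab
% Illustrative sketch: validate user-supplied class boundaries before the call.
k  = 5;                                  % number of classes
cb = zeros(k, 1);                        % ASSUMPTION: cb holds k elements
cb(1:k-1) = [21.0; 22.0; 22.5; 23.0];    % the k-1 boundaries (made-up values)

% Constraint on the first k-1 elements: strictly ascending
isAscending = all(diff(cb(1:k-1)) > 0);
if ~isAscending
    error('Class boundaries in cb must be strictly ascending.');
end

% Assumed call form (requires the NAG Toolbox; not executed here):
% [cb, ifreq, xmin, xmax, ifail] = g01ae(int64(k), x, 'cb', cb);
```

Only the first k-1 elements take part in the check; the last element is padding in this sketch.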
Output Parameters
- 1: cb(k) – double array
  The first $k - 1$ elements of cb contain the class boundary values in ascending order.
- 2: ifreq(k) – int64/int32/nag_int array
  The elements of ifreq contain the frequencies in each class, $f_i$, for $i = 1, 2, \ldots, k$. In particular ifreq(1) contains the frequency of the class up to cb(1), $f_1$, and ifreq(k) contains the frequency of the class from cb(k-1) upwards, $f_k$.
- 3: xmin – double scalar
  The smallest value in the sample, $a$.
- 4: xmax – double scalar
  The largest value in the sample, $b$.
- 5: ifail – int64/int32/nag_int scalar
  ifail = 0 unless the function detects an error (see Error Indicators and Warnings).
Error Indicators and Warnings
Errors or warnings detected by the function:
- ifail = 1
  On entry, $k < 2$.
- ifail = 2
  On entry, $n < 1$.
- ifail = 3
  On entry, the user-supplied class boundary values are not in ascending order.
- ifail = -99
  An unexpected error has been triggered by this routine. Please contact NAG.
- ifail = -399
  Your licence key may have expired or may not have been installed correctly.
- ifail = -999
  Dynamic memory allocation failed.
Accuracy
The method used is believed to be stable.
Further Comments
The time taken by nag_stat_frequency_table (g01ae) increases with k and n. It also depends on the distribution of the sample observations.
Example
This example summarises a number of datasets. For each dataset the sample observations and optionally class boundary values are read. nag_stat_frequency_table (g01ae) is then called and the frequency distribution and largest and smallest observations printed.
function g01ae_example

fprintf('g01ae example results\n\n');

% Sample of 70 observations (column vector)
x = [22.3; 21.6; 22.6; 22.4; 22.4; 22.4; 22.1; 21.9; 23.1; 23.4; 23.4;
     22.6; 22.5; 22.5; 22.1; 22.6; 22.3; 22.4; 21.8; 22.3; 22.1; 23.6;
     20.8; 22.2; 23.1; 21.1; 21.7; 21.4; 21.6; 22.5; 21.2; 22.6; 22.2;
     22.2; 21.4; 21.7; 23.2; 23.1; 22.3; 22.3; 21.1; 21.4; 21.5; 21.8;
     22.8; 21.4; 20.7; 21.6; 23.2; 23.6; 22.7; 21.7; 23.0; 21.9; 22.6;
     22.1; 22.2; 23.4; 21.5; 23.0; 22.8; 21.4; 23.2; 21.8; 21.2; 22.0;
     22.4; 22.8; 23.2; 23.6];

% Number of classes, including the two extreme classes
k = int64(7);

% cb is not supplied, so the class boundaries are function-calculated
[cb, ifreq, xmin, xmax, ifail] = g01ae(k, x);

fprintf('Number of cases %3d\n',size(x,1));
fprintf('Number of classes %3d\n\n',k);
fprintf('Routine-supplied class boundaries\n\n');
fprintf(' Class Frequency\n');

% First class: everything below cb(1)
fprintf('%9s to%7.2f%14d\n', 'Up', cb(1), ifreq(1));
% Inner classes: cb(i-1) inclusive to cb(i) exclusive
for i=2:k-1
  fprintf('%7.2f to%7.2f%14d\n', cb(i-1), cb(i), ifreq(i));
end
% Last class: cb(k-1) and above
fprintf('%7.2f and%7s%14d\n\n', cb(k-1), 'over', ifreq(k));

fprintf('Total frequency = %5d\n',sum(ifreq));
fprintf('Minimum = %8.2f\n',xmin);
fprintf('Maximum = %8.2f\n',xmax);
g01ae example results
Number of cases 70
Number of classes 7
Routine-supplied class boundaries
      Class        Frequency
   Up to 20.70          0
20.70 to 21.28          6
21.28 to 21.86         16
21.86 to 22.44         21
22.44 to 23.02         14
23.02 to 23.60         13
23.60 and over          0
Total frequency = 70
Minimum = 20.70
Maximum = 23.60
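The frequencies printed above can be cross-checked without the toolbox. The sketch below, in plain MATLAB, rebuilds six boundaries by dividing the range from the minimum to the maximum into k-2 = 5 equal intervals and then counts with the half-open rule from the Description. The outward margin applied to the first and last boundary is an assumed value: the printed boundaries are rounded to two decimals, so the adjustment actually used by the routine cannot be read off the output.

```matlab
% Illustrative cross-check of the example table; not the NAG implementation.
x = [22.3; 21.6; 22.6; 22.4; 22.4; 22.4; 22.1; 21.9; 23.1; 23.4; 23.4; ...
     22.6; 22.5; 22.5; 22.1; 22.6; 22.3; 22.4; 21.8; 22.3; 22.1; 23.6; ...
     20.8; 22.2; 23.1; 21.1; 21.7; 21.4; 21.6; 22.5; 21.2; 22.6; 22.2; ...
     22.2; 21.4; 21.7; 23.2; 23.1; 22.3; 22.3; 21.1; 21.4; 21.5; 21.8; ...
     22.8; 21.4; 20.7; 21.6; 23.2; 23.6; 22.7; 21.7; 23.0; 21.9; 22.6; ...
     22.1; 22.2; 23.4; 21.5; 23.0; 22.8; 21.4; 23.2; 21.8; 21.2; 22.0; ...
     22.4; 22.8; 23.2; 23.6];
k = 7;

a = min(x);
b = max(x);

y = linspace(a, b, k - 1);       % 20.70, 21.28, ..., 23.60 (5 equal intervals)
delta = 1e-3 * (b - a);          % ASSUMPTION: hypothetical outward margin
y(1)   = a - delta;              % first boundary just below the minimum
y(end) = b + delta;              % last boundary just above the maximum

% Class index = number of boundaries at or below each value, plus one
cls = sum(bsxfun(@ge, x, y), 2) + 1;
f   = accumarray(cls, 1, [k 1]);

fprintf('%d ', f);
fprintf('\n');
```

The inner boundaries (21.28, 21.86, 22.44, 23.02) lie well away from the data, which are recorded to one decimal place, so only the two outer boundaries are sensitive to the size of the margin.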
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015