nag_approx_quantiles_fixed (g01anc) finds approximate quantiles from a data stream of known size using an out-of-core algorithm.
A quantile is a value which divides a frequency distribution such that there is a given proportion of data values below the quantile. For example, the median of a dataset is the 0.5 quantile because half the values are less than or equal to it.
nag_approx_quantiles_fixed (g01anc) uses a slightly modified version of an algorithm described in a paper by Zhang and Wang (2007) to determine ε-approximate quantiles of a data stream of n real values, where n is known. Given any quantile q ∈ [0.0, 1.0], an ε-approximate quantile is defined as an element in the data stream whose rank falls within [(q − ε)n, (q + ε)n]. In case of more than one ε-approximate quantile being available, the one closest to qn is returned.
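The rank-window definition above can be illustrated with a small check. This is only a sketch of the definition, not of the routine's internal algorithm: the helper names are hypothetical, ranks are assumed to be 1-based (rank 1 is the minimum, rank n the maximum), and clamping the window to the valid ranks is an assumption made here for convenience.

```c
#include <assert.h>
#include <stddef.h>

/* Rank window [(q - eps) * n, (q + eps) * n] for an eps-approximate
   q-quantile. Clamping to 1..n is an assumption of this sketch. */
typedef struct {
    double lo;
    double hi;
} rank_window;

rank_window approx_rank_window(double q, double eps, size_t n)
{
    rank_window w;
    double lo, hi;

    assert(q >= 0.0 && q <= 1.0 && eps > 0.0 && n > 0);
    lo = (q - eps) * (double)n;
    hi = (q + eps) * (double)n;
    w.lo = lo < 1.0 ? 1.0 : lo;              /* no rank below 1 */
    w.hi = hi > (double)n ? (double)n : hi;  /* no rank above n */
    return w;
}

/* Returns 1 if the element with the given 1-based rank qualifies as an
   eps-approximate q-quantile of a stream of n values, 0 otherwise. */
int is_approx_quantile(size_t rank, double q, double eps, size_t n)
{
    rank_window w = approx_rank_window(q, eps, n);
    return (double)rank >= w.lo && (double)rank <= w.hi;
}
```

For n = 1000, q = 0.5 and ε = 0.01, any element whose rank lies between roughly 490 and 510 qualifies, so the elements of rank 495 and 500 are both acceptable answers while the element of rank 520 is not.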
Zhang Q and Wang W (2007) A fast algorithm for approximate quantiles in high speed data streams Proceedings of the 19th International Conference on Scientific and Statistical Database Management IEEE Computer Society 29
- 1:
ind – Integer * Input/Output
On entry: indicates the action required in the current call to nag_approx_quantiles_fixed (g01anc).
- ind = 0: Return the required lengths of rcomm and icomm in icomm[0] and icomm[1] respectively. n and eps must be set and licomm must be at least 2.
- ind = 1: Initialise the communication arrays and process the first nb values from the data stream as supplied in rv.
- ind = 2: Process the next block of nb values from the data stream. The calling program must update rv and (if required) nb, and re-enter nag_approx_quantiles_fixed (g01anc) with all other parameters unchanged.
- ind = 3: Calculate the nq ε-approximate quantiles specified in q. The calling program must set q and nq and re-enter nag_approx_quantiles_fixed (g01anc) with all other parameters unchanged. This option can be chosen only once enough data points have been processed for the chosen eps; otherwise NE_TOO_SMALL is returned.
On exit: indicates output from a successful call.
- ind = 1: Lengths of rcomm and icomm have been returned in icomm[0] and icomm[1] respectively.
- ind = 2: nag_approx_quantiles_fixed (g01anc) has processed np data points and expects to be called again with additional data (i.e., np < n).
- ind = 3: nag_approx_quantiles_fixed (g01anc) has returned the requested ε-approximate quantiles in qv. These quantiles are based on np data points.
- ind = 4: Routine has processed all n data points (i.e., np = n).
Constraint:
on entry ind = 0, 1, 2 or 3.
- 2:
n – Integer Input
On entry: n, the total number of values in the data stream.
Constraint:
n > 0.
- 3:
rv[dim] – const double Input
-
Note: the dimension, dim, of the array rv must be at least nb when ind = 1 or 2.
On entry: if ind = 1 or 2, the vector containing the current block of data, otherwise rv is not referenced.
- 4:
nb – Integer Input
On entry: if ind = 1 or 2, the size of the current block of data. The size of blocks of data in array rv can vary; therefore nb can change between calls to nag_approx_quantiles_fixed (g01anc).
Constraint:
if ind = 1 or 2, nb > 0.
- 5:
eps – double Input
On entry: approximation factor ε.
Constraint:
.
- 6:
np – Integer * Output
On exit: the number of elements processed so far.
- 7:
q[dim] – const double Input
-
Note: the dimension, dim, of the array q must be at least nq when ind = 3.
On entry: if ind = 3, the quantiles to be calculated, otherwise q is not referenced. Note that q[i] = 0.0 corresponds to the minimum value and q[i] = 1.0 to the maximum value.
Constraint:
if ind = 3, 0.0 ≤ q[i-1] ≤ 1.0, for i = 1, 2, …, nq.
- 8:
qv[dim] – double Output
-
Note: the dimension, dim, of the array qv must be at least nq when ind = 3.
On exit: if ind = 3, qv[i] contains the ε-approximate quantile specified by the value provided in q[i].
- 9:
nq – Integer Input
On entry: if ind = 3, the number of quantiles requested, otherwise nq is not referenced.
Constraint:
if ind = 3, nq > 0.
- 10:
rcomm[lrcomm] – double Communication Array
- 11:
lrcomm – Integer Input
On entry: the dimension of the array rcomm.
Constraint:
if ind ≠ 0, lrcomm must be at least equal to the value returned in icomm[0] by a call to nag_approx_quantiles_fixed (g01anc) with ind = 0
. This will not be more than
, where
.
- 12:
icomm[licomm] – Integer Communication Array
- 13:
licomm – Integer Input
On entry: the dimension of the array icomm.
Constraints:
- if ind = 0, licomm ≥ 2;
- otherwise licomm must be at least equal to the value returned in icomm[1] by a call to nag_approx_quantiles_fixed (g01anc) with ind = 0. This will not be more than , where and .
- 14:
fail – NagError * Input/Output
-
The NAG error argument (see Section 3.6 in the Essential Introduction).
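The calling pattern implied by ind can be sketched as a driver loop. The sketch below does not call the NAG Library: mock_stream_step is a hypothetical stand-in that only counts consumed points, and it assumes that ind = 1 on entry initialises, that ind = 2 on exit means more data is expected (np < n), and that ind = 4 on exit means all n points have been processed. The block length is allowed to shrink for the final block, reflecting that nb may change between calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one streaming call. It mimics only the ind
   transitions assumed above and computes no quantiles. */
static void mock_stream_step(int *ind, long n, long nb, long *np)
{
    if (*ind == 1) {
        *np = 0;                      /* initialisation call */
    }
    *np += nb;                        /* consume the current block */
    *ind = (*np < n) ? 2 : 4;         /* more data expected, or finished */
}

/* Feeds n values in blocks of at most block_len values.
   Returns the number of calls made; *np_out receives the final count. */
long stream_in_blocks(long n, long block_len, long *np_out)
{
    int ind = 1;                      /* first call initialises */
    long np = 0;
    long calls = 0;

    assert(n > 0 && block_len > 0);
    while (ind != 4) {
        long remaining = n - np;
        long nb = remaining < block_len ? remaining : block_len;
        /* A real caller would refill rv with the next nb values here. */
        mock_stream_step(&ind, n, nb, &np);
        ++calls;
    }
    *np_out = np;
    return calls;
}
```

With n = 10 and a block length of 4 the loop makes three calls with nb = 4, 4 and 2. In a real program a quantile query (ind = 3 on entry) could be issued between blocks or after the final block, provided enough data has been streamed.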
- NE_ALLOC_FAIL
-
Dynamic memory allocation failed.
- NE_ARRAY_SIZE
-
On entry, licomm is too small: licomm = ⟨value⟩.
On entry, lrcomm is too small: lrcomm = ⟨value⟩.
- NE_BAD_PARAM
-
On entry, argument ⟨value⟩ had an illegal value.
- NE_INT
-
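On entry, ind = 1 or 2 and nb = ⟨value⟩.
Constraint: if ind = 1 or 2 then nb > 0.
On entry, ind = 3 and nq = ⟨value⟩.
Constraint: if ind = 3 then nq > 0.
On entry, ind = ⟨value⟩.
Constraint: ind = 0, 1, 2 or 3.
On entry, n = ⟨value⟩.
Constraint: n > 0.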
- NE_INTERNAL_ERROR
-
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact
NAG for assistance.
- NE_Q_OUT_OF_RANGE
-
On entry, ind = 3 and q[⟨value⟩] = ⟨value⟩.
Constraint: if ind = 3 then 0.0 ≤ q[i] ≤ 1.0 for all i.
- NE_REAL
-
On entry, eps = ⟨value⟩.
Constraint: .
- NE_TOO_SMALL
-
Number of data elements streamed, ⟨value⟩, is not sufficient for a quantile query when eps = ⟨value⟩.
Supply more data or reprocess the data with a higher eps value.
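A caller may prefer to check the requested quantiles before a query rather than rely on NE_Q_OUT_OF_RANGE. The helper below is a minimal sketch with a hypothetical name; it assumes the valid range for each entry of q is 0.0 to 1.0 inclusive, with 0.0 requesting the minimum and 1.0 the maximum.

```c
#include <stddef.h>

/* Returns the index of the first entry of q outside [0.0, 1.0],
   or -1 if all nq entries are acceptable. NaN entries are rejected
   because both comparisons are false for NaN. */
long first_invalid_quantile(const double q[], long nq)
{
    long i;

    for (i = 0; i < nq; ++i) {
        if (!(q[i] >= 0.0 && q[i] <= 1.0)) {
            return i;
        }
    }
    return -1;
}
```

A return value of -1 means the array can be passed on as it stands; any other value identifies the offending entry.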
Not applicable.
nag_approx_quantiles_fixed (g01anc) is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
Please consult the
Users' Note for your implementation for any additional implementation-specific information.