g02bu calculates the sample means and sums of squares and cross-products, or sums of squares and cross-products of deviations from the mean, in a single pass for a set of data. The data may be weighted.
Syntax
C# |
---|
public static void g02bu( string mean, string weight, int n, int m, double[,] x, double[] wt, out double sw, double[] wmean, double[] c, out int ifail ) |
Visual Basic |
---|
Public Shared Sub g02bu ( _ mean As String, _ weight As String, _ n As Integer, _ m As Integer, _ x As Double(,), _ wt As Double(), _ <OutAttribute> ByRef sw As Double, _ wmean As Double(), _ c As Double(), _ <OutAttribute> ByRef ifail As Integer _ ) |
Visual C++ |
---|
public: static void g02bu( String^ mean, String^ weight, int n, int m, array<double,2>^ x, array<double>^ wt, [OutAttribute] double% sw, array<double>^ wmean, array<double>^ c, [OutAttribute] int% ifail ) |
F# |
---|
static member g02bu : mean : string * weight : string * n : int * m : int * x : float[,] * wt : float[] * sw : float byref * wmean : float[] * c : float[] * ifail : int byref -> unit |
Parameters
- mean
- Type: System.String. On entry: indicates whether g02bu is to calculate sums of squares and cross-products, or sums of squares and cross-products of deviations about the mean.
- mean = "M": the sums of squares and cross-products of deviations about the mean are calculated.
- mean = "Z": the sums of squares and cross-products are calculated.
Constraint: mean = "M" or "Z".
- weight
- Type: System.String. On entry: indicates whether the data is weighted or not.
- weight = "U": the calculations are performed on unweighted data.
- weight = "W": the calculations are performed on weighted data.
Constraint: weight = "U" or "W".
- n
- Type: System.Int32. On entry: $n$, the number of observations in the dataset. Constraint: $n \geq 1$.
- m
- Type: System.Int32. On entry: $m$, the number of variables. Constraint: $m \geq 1$.
- x
- Type: System.Double[,]. An array of size [dim1, m]. Note: dim1 must satisfy the constraint: dim1 $\geq n$. On entry: x[$i-1$, $j-1$] must contain the $i$th observation on the $j$th variable, for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.
- wt
- Type: System.Double[]. An array of size [dim1]. Note: the dimension of the array wt must be at least $n$ if weight = "W"; a smaller array is sufficient otherwise. On entry: the optional weights of each observation. If weight = "U", wt is not referenced. If weight = "W", wt[$i-1$] must contain the weight for the $i$th observation. Constraint: if weight = "W", wt[$i-1$] $\geq 0.0$, for $i = 1, 2, \ldots, n$.
- sw
- Type: System.Double%. On exit: the sum of weights. If weight = "U", sw contains the number of observations, $n$.
- wmean
- Type: System.Double[]. An array of size [m]. On exit: the sample means. wmean[$j-1$] contains the mean for the $j$th variable.
- c
- Type: System.Double[]. An array of size [$(m \times m + m)/2$]. On exit: the cross-products. If mean = "M", c contains the upper triangular part of the matrix of (weighted) sums of squares and cross-products of deviations about the mean. If mean = "Z", c contains the upper triangular part of the matrix of (weighted) sums of squares and cross-products. These are stored packed by columns, i.e., the cross-product between the $j$th and $k$th variable, $k \geq j$, is stored in c[$k \times (k-1)/2 + j - 1$].
- ifail
- Type: System.Int32%. On exit: ifail = 0 unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).
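The packed layout of c can be awkward to index by hand. The following is a minimal C# sketch, not part of the library, that assumes the usual column-packed upper-triangle convention: the one-based pair $(j, k)$ with $k \geq j$ sits at zero-based position $k(k-1)/2 + j - 1$. The names `PackedUpper`, `Index` and `Unpack` are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration; not part of the NAG library.
public static class PackedUpper
{
    // Zero-based position in c of the cross-product between variables j and k,
    // where j and k are themselves zero-based. Assumes the upper triangle is
    // packed by columns.
    public static int Index(int j, int k)
    {
        if (j > k) { int t = j; j = k; k = t; } // symmetric: use the upper triangle
        return k * (k + 1) / 2 + j;
    }

    // Expand the packed array into a full m-by-m symmetric matrix.
    public static double[,] Unpack(double[] c, int m)
    {
        if (c.Length < m * (m + 1) / 2)
            throw new ArgumentException("c is too short for m variables");
        var full = new double[m, m];
        for (int k = 0; k < m; k++)
            for (int j = 0; j <= k; j++)
                full[j, k] = full[k, j] = c[Index(j, k)];
        return full;
    }
}
```

With this helper, `PackedUpper.Unpack(c, m)[0, 1]` would read the cross-product between the first and second variables.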
Description
g02bu is an adaptation of West's WV2 algorithm; see West (1979). This method calculates the (optionally weighted) sample means and (optionally weighted) sums of squares and cross-products or sums of squares and cross-products of deviations from the (weighted) mean for a sample of $n$ observations on $m$ variables $X_j$, for $j = 1, 2, \ldots, m$. The algorithm makes a single pass through the data.
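For the first $i-1$ observations let the mean of the $j$th variable be $\bar{x}_j(i-1)$, the cross-product about the mean for the $j$th and $k$th variables be $c_{jk}(i-1)$ and the sum of weights be $W_{i-1}$. These are updated by the $i$th observation, $x_{ij}$, for $j = 1, 2, \ldots, m$, with weight $w_i$ as follows:
$$W_i = W_{i-1} + w_i$$
$$\bar{x}_j(i) = \bar{x}_j(i-1) + \frac{w_i}{W_i}\left(x_{ij} - \bar{x}_j(i-1)\right), \quad j = 1, 2, \ldots, m$$
and
$$c_{jk}(i) = c_{jk}(i-1) + \frac{w_i}{W_i}\left(x_{ij} - \bar{x}_j(i-1)\right)\left(x_{ik} - \bar{x}_k(i-1)\right) W_{i-1}, \quad j = 1, 2, \ldots, m; \; k = j, j+1, \ldots, m.$$
The algorithm is initialized by taking $\bar{x}_j(1) = x_{1j}$, the first observation, and $c_{jk}(1) = 0.0$.
For the unweighted case $w_i = 1$ and $W_i = i$ for all $i$.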
Note that only the upper triangle of the matrix is calculated and returned packed by column.
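To make the single-pass update concrete, here is a minimal C# sketch of a West-style recurrence for the means and the cross-products of deviations about the mean. It is an illustration under stated assumptions, not the library's implementation: the class `WestSketch` and method `Accumulate` are hypothetical names, a null `wt` stands in for the unweighted case, and observations with zero weight are simply skipped.

```csharp
using System;

// Illustrative sketch only; this is not the NAG implementation.
public static class WestSketch
{
    // x:  n-by-m data matrix (row i is observation i).
    // wt: weights of length n, or null for unweighted data (assumption).
    // On return, wmean has length m and c holds the upper triangle, packed
    // by columns, of the cross-products of deviations about the mean.
    public static void Accumulate(double[,] x, double[] wt,
                                  out double sw, out double[] wmean, out double[] c)
    {
        int n = x.GetLength(0), m = x.GetLength(1);
        sw = 0.0;
        wmean = new double[m];
        c = new double[m * (m + 1) / 2];
        var d = new double[m];

        for (int i = 0; i < n; i++)
        {
            double w = (wt == null) ? 1.0 : wt[i];
            if (w < 0.0) throw new ArgumentException("weights must be non-negative");
            if (w == 0.0) continue;          // a zero weight contributes nothing

            double swOld = sw;               // W_{i-1}
            sw += w;                         // W_i
            double r = w / sw;               // w_i / W_i

            // Deviations of the new observation from the previous means.
            for (int j = 0; j < m; j++) d[j] = x[i, j] - wmean[j];

            // Update the packed upper triangle, column by column.
            for (int k = 0, p = 0; k < m; k++)
                for (int j = 0; j <= k; j++, p++)
                    c[p] += r * swOld * d[j] * d[k];

            // Update the means last, since c needs the previous means.
            for (int j = 0; j < m; j++) wmean[j] += r * d[j];
        }
    }
}
```

Starting from a zero sum of weights reproduces the initialization described above: the first observation with positive weight sets the means to its own values and leaves the cross-products at zero.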
References
Chan T F, Golub G H and LeVeque R J (1982) Updating formulae and a pairwise algorithm for computing sample variances Compstat, Physica-Verlag
West D H D (1979) Updating mean and variance estimates: An improved method Comm. ACM 22 532–535
Error Indicators and Warnings
Errors or warnings detected by the method:
Some error messages may refer to parameters that are dropped from this interface (LDX). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.
On entry, $m < 1$, or $n < 1$.
On entry, mean $\neq$ "M" or "Z".
On entry, weight $\neq$ "U" or "W".
Accuracy
For a detailed discussion of the accuracy of this algorithm see Chan et al. (1982) or West (1979).
Parallelism and Performance
None.
Further Comments
g02bw may be used to calculate the correlation coefficients from the cross-products of deviations about the mean. The cross-products of deviations about the mean may be scaled using f06fd (F06EDF is not in this release) to give a variance-covariance matrix.
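The arithmetic behind both steps is short enough to sketch directly in C#. This is an illustration only and does not call g02bw or f06fd; the class and method names are hypothetical. The divisor sw − 1 is an assumption: it is the usual choice for unweighted data, while the appropriate divisor for weighted data depends on how the weights are interpreted.

```csharp
using System;

// Illustrative sketch only; names are hypothetical.
public static class CrossProductScaling
{
    // Scale packed cross-products of deviations by 1/(sw - 1) to obtain a
    // packed variance-covariance matrix. The divisor is an assumption.
    public static double[] ToCovariance(double[] c, double sw)
    {
        if (sw <= 1.0) throw new ArgumentException("sw must exceed 1");
        double alpha = 1.0 / (sw - 1.0);
        var v = new double[c.Length];
        for (int p = 0; p < c.Length; p++) v[p] = alpha * c[p];
        return v;
    }

    // Correlation r_jk = c_jk / sqrt(c_jj * c_kk), in the same packed layout.
    // A variable with zero sum of squares yields NaN or infinity here.
    public static double[] ToCorrelation(double[] c, int m)
    {
        var r = new double[m * (m + 1) / 2];
        for (int k = 0, p = 0; k < m; k++)
            for (int j = 0; j <= k; j++, p++)
            {
                double cjj = c[j * (j + 1) / 2 + j]; // diagonal entry for variable j
                double ckk = c[k * (k + 1) / 2 + k]; // diagonal entry for variable k
                r[p] = c[p] / Math.Sqrt(cjj * ckk);
            }
        return r;
    }
}
```

Both methods expect c to hold cross-products of deviations about the mean, packed by columns.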
The means and cross-products produced by g02bu may be updated by adding or removing observations using g02bt.
Two sets of means and cross-products, as produced by g02bu, can be combined using G02BZF, which is not in this release.
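Without a combining method in this release, the merge can be done by hand. The sketch below follows the style of the pairwise formula discussed by Chan et al. (1982), extended in the obvious way to weights and several variables. It is an assumption-laden illustration, not a reimplementation of G02BZF, and the names `CombineSketch` and `Combine` are hypothetical. Both inputs must hold cross-products of deviations about the mean.

```csharp
using System;

// Illustrative sketch only; not a reimplementation of any library method.
public static class CombineSketch
{
    // Merge summary B (swB, meanB, cB) into summary A in place.
    // cA and cB are packed upper triangles of cross-products of deviations.
    public static void Combine(ref double swA, double[] meanA, double[] cA,
                               double swB, double[] meanB, double[] cB)
    {
        int m = meanA.Length;
        if (meanB.Length != m || cA.Length != m * (m + 1) / 2 || cB.Length != cA.Length)
            throw new ArgumentException("inconsistent array sizes");

        double sw = swA + swB;
        if (sw <= 0.0) return;               // both summaries are empty

        // Difference between the two mean vectors.
        var d = new double[m];
        for (int j = 0; j < m; j++) d[j] = meanB[j] - meanA[j];

        // Cross-products: add both parts plus a between-group correction.
        double f = swA * swB / sw;
        for (int k = 0, p = 0; k < m; k++)
            for (int j = 0; j <= k; j++, p++)
                cA[p] += cB[p] + f * d[j] * d[k];

        // Means: weighted average of the two mean vectors.
        for (int j = 0; j < m; j++) meanA[j] += (swB / sw) * d[j];
        swA = sw;
    }
}
```

Merging in place keeps the call pattern close to an accumulator, which suits data that arrive in blocks.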
Example
A program to calculate the means, the required sums of squares and cross-products matrix, and the variance matrix for a set of observations of variables.
Example program (C#): g02bue.cs