naginterfaces.library.nonpar.rank_regsn
- naginterfaces.library.nonpar.rank_regsn(nv, y, x, idist, nmax, tol)
rank_regsn calculates the parameter estimates, score statistics and their variance-covariance matrices for the linear model using a likelihood based on the ranks of the observations.
For full information please refer to the NAG Library document for g08ra
https://support.nag.com/numeric/nl/nagdoc_30.1/flhtml/g08/g08raf.html
- Parameters
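- nv : int, array-like, shape (ns)
The number of observations in the $i$th sample, for $i = 1,2,\ldots,\mathrm{ns}$, where ns is the number of samples.
- y : float, array-like, shape (nsum)
The observations in each sample, where nsum is the total number of observations. Specifically, $\mathrm{y}[\sum_{k=1}^{i-1}\mathrm{nv}[k-1]+j-1]$ must contain the $j$th observation in the $i$th sample.
- x : float, array-like, shape (nsum, ip)
The design matrices for each sample, where ip is the number of parameters to be fitted. Specifically, $\mathrm{x}[\sum_{k=1}^{i-1}\mathrm{nv}[k-1]+j-1,\,l-1]$ must contain the value of the $l$th explanatory variable for the $j$th observation in the $i$th sample.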
- idist : int
The error distribution to be used in the analysis.
idist = 1: Normal.
idist = 2: Logistic.
idist = 3: Extreme value.
idist = 4: Double-exponential.
- nmax : int
The value of the largest sample size.
- tol : float
The tolerance for judging whether two observations are tied. Thus, observations $Y_i$ and $Y_j$ are adjudged to be tied if $|Y_i - Y_j| < \mathrm{tol}$.
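The stacked layout of nv, y and x described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data values and the helper name `flat_index` are hypothetical, and the indexing is 0-based as in Python.

```python
# Hypothetical two-sample data set with a single explanatory variable.
samples_y = [
    [1.0, 3.5, 2.2, 4.1],  # observations in sample 1
    [0.7, 1.9, 2.8],       # observations in sample 2
]
samples_x = [
    [[1.0], [4.0], [2.0], [5.0]],  # design matrix for sample 1
    [[0.0], [2.0], [3.0]],         # design matrix for sample 2
]

# Sample sizes, then the samples stacked one after another.
nv = [len(sample) for sample in samples_y]
y = [obs for sample in samples_y for obs in sample]
x = [row for sample in samples_x for row in sample]
nmax = max(nv)  # size of the largest sample


def flat_index(i, j):
    """0-based position in y (and row of x) of observation j in sample i."""
    return sum(nv[:i]) + j
```

Lists built this way could then be supplied as the nv, y, x and nmax arguments of the signature shown at the top of this page; idist and tol remain the caller's choice.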
- Returns
- prvr : float, ndarray, shape (ip+1, ip)
The variance-covariance matrices of the score statistics and the parameter estimates, the former being stored in the upper triangle and the latter in the lower triangle. Thus for $1 \le i \le j \le \mathrm{ip}$, $\mathrm{prvr}[i-1,j-1]$ contains an estimate of the covariance between the $i$th and $j$th score statistics. For $1 \le j \le i \le \mathrm{ip}$, $\mathrm{prvr}[i,j-1]$ contains an estimate of the covariance between the $i$th and $j$th parameter estimates.
- irank : int, ndarray, shape (nmax)
For the one sample case, irank contains the ranks of the observations.
- zin : float, ndarray, shape (nmax)
For the one sample case, zin contains the expected values of the function $g(\cdot)$ of the order statistics.
- eta : float, ndarray, shape (nmax)
For the one sample case, eta contains the expected values of the function $g'(\cdot)$ of the order statistics.
- vapvec : float, ndarray, shape (nmax×(nmax+1)/2)
For the one sample case, vapvec contains the upper triangle of the variance-covariance matrix of the function $g(\cdot)$ of the order statistics stored column-wise.
- parest : float, ndarray, shape (4×ip+1)
The statistics calculated by the function.
The first ip components of parest contain the score statistics.
The next ip elements contain the parameter estimates.
$\mathrm{parest}[2\times\mathrm{ip}]$ contains the value of the $\chi^2$ statistic.
The next ip elements of parest contain the standard errors of the parameter estimates.
Finally, the remaining ip elements of parest contain the $z$-statistics.
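The packed return values above can be unpacked with a few small helpers. The following is a sketch, not part of the library: the helper names are hypothetical, and it assumes that prvr has shape (ip+1, ip) with the parameter covariance for pair (i, j) stored one row below the corresponding score entry, that vapvec uses standard column-wise packed upper-triangle storage, and that parest has length 4×ip+1 in the order listed.

```python
def split_prvr(prvr):
    """Split prvr into full symmetric score and parameter covariance matrices."""
    ip = len(prvr[0])
    if len(prvr) != ip + 1:
        raise ValueError("expected prvr to have ip+1 rows and ip columns")
    score_cov = [[0.0] * ip for _ in range(ip)]
    param_cov = [[0.0] * ip for _ in range(ip)]
    for i in range(ip):
        for j in range(i, ip):
            # Upper triangle (row <= column): score statistics.
            score_cov[i][j] = score_cov[j][i] = prvr[i][j]
            # Lower triangle, shifted down one row: parameter estimates.
            param_cov[j][i] = param_cov[i][j] = prvr[j + 1][i]
    return score_cov, param_cov


def unpack_vapvec(vapvec, n):
    """Expand a column-wise packed upper triangle into a full n-by-n matrix."""
    full = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j + 1):
            # Column j starts at offset j*(j+1)/2 in packed storage.
            full[i][j] = full[j][i] = vapvec[i + j * (j + 1) // 2]
    return full


def unpack_parest(parest):
    """Split parest into its five documented blocks."""
    ip, remainder = divmod(len(parest) - 1, 4)
    if remainder != 0 or ip < 1:
        raise ValueError("expected parest to have length 4*ip + 1")
    return {
        "score": list(parest[:ip]),
        "estimate": list(parest[ip:2 * ip]),
        "chi2": parest[2 * ip],
        "stderr": list(parest[2 * ip + 1:3 * ip + 1]),
        "z": list(parest[3 * ip + 1:]),
    }
```

The helpers index with plain `[]`, so they accept either nested lists or the ndarrays returned by the function.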
- Raises
- NagValueError
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, elements of .
Constraint: .
- (errno )
On entry, and .
Constraint: .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, idist = ⟨value⟩.
Constraint: idist = 1, 2, 3 or 4.
- (errno )
On entry, all the observations were adjudged to be tied.
- (errno )
The matrix $X^{T}(B-A)X$ is either ill-conditioned or not positive definite.
- (errno )
On entry, for , for all .
Constraint: for at least one .
- Notes
Analysis of data can be made by replacing observations by their ranks. The analysis produces inference for regression parameters arising from the following model.
For random variables $Y_1, Y_2, \ldots, Y_n$ we assume that, after an arbitrary monotone increasing differentiable transformation, $h(\cdot)$, the model

$$h(Y_i) = x_i^{T}\beta + \epsilon_i \qquad (1)$$

holds, where $x_i$ is a known vector of explanatory variables and $\beta$ is a vector of $p$ unknown regression coefficients. The $\epsilon_i$ are random variables assumed to be independent and identically distributed with a completely known distribution which can be one of the following: Normal, logistic, extreme value or double-exponential. In Pettitt (1982) an estimate for $\beta$ is proposed as $\hat{\beta} = M X^{T} a$ with estimated variance-covariance matrix $M$. The statistics $a$ and $M$ depend on the ranks $r_i$ of the observations $Y_i$ and the density chosen for $\epsilon_i$.
The matrix $X$ is the $n$ by $p$ matrix of explanatory variables. It is assumed that $X$ is of rank $p$ and that a column or a linear combination of columns of $X$ is not equal to the column vector of $1$s or a multiple of it. This means that a constant term cannot be included in the model (1). The statistics $a$ and $M$ are found as follows. Let $\epsilon_i$ have pdf $f(\epsilon)$ and let $g = -f'/f$. Let $W_1, W_2, \ldots, W_n$ be order statistics for a random sample of size $n$ with the density $f(\cdot)$. Define $Z_i = g(W_i)$, then $a_i = E(Z_{r_i})$. To define $M$ we need $M^{-1} = X^{T}(B - A)X$, where $B$ is an $n$ by $n$ diagonal matrix with $B_{ii} = E(g'(W_{r_i}))$ and $A$ is a symmetric matrix with $A_{ij} = \mathrm{cov}(Z_{r_i}, Z_{r_j})$. In the case of the Normal distribution, the $Z_1 < \cdots < Z_n$ are standard Normal order statistics and $E(g'(W_i)) = 1$, for $i = 1,2,\ldots,n$.
The analysis can also deal with ties in the data. Two observations are adjudged to be tied if $|Y_i - Y_j| < \mathrm{tol}$, where tol is a user-supplied tolerance level.
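The pairwise tie criterion can be checked on a data set before calling the function. The sketch below is illustrative only: the helper names are hypothetical, it applies the strict inequality $|Y_i - Y_j| < \mathrm{tol}$ to every pair, and it does not attempt to reproduce how the library groups ties internally.

```python
def tied_pairs(y, tol):
    """Return index pairs (i, j), i < j, whose observations differ by less than tol."""
    return [
        (i, j)
        for i in range(len(y))
        for j in range(i + 1, len(y))
        if abs(y[i] - y[j]) < tol
    ]


def all_tied(y, tol):
    """True when every pair of observations is adjudged tied."""
    n = len(y)
    return len(tied_pairs(y, tol)) == n * (n - 1) // 2
```

A sample for which `all_tied` is true carries no rank information, which corresponds to the error condition listed under Raises where all observations are adjudged to be tied.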
Various statistics can be found from the analysis:
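(a) The score statistic $X^{T}a$. This statistic is used to test the hypothesis $H_0 : \beta = 0$, see (e).
(b) The estimated variance-covariance matrix $X^{T}(B - A)X$ of the score statistic in (a).
(c) The estimate $\hat{\beta} = M X^{T} a$.
(d) The estimated variance-covariance matrix $M = (X^{T}(B - A)X)^{-1}$ of the estimate $\hat{\beta}$.
(e) The $\chi^2$ statistic $Q = \hat{\beta}^{T} M^{-1} \hat{\beta} = a^{T} X (X^{T}(B - A)X)^{-1} X^{T} a$ used to test $H_0 : \beta = 0$. Under $H_0$, $Q$ has an approximate $\chi^2$-distribution with $p$ degrees of freedom.
(f) The standard errors $M_{ii}^{1/2}$ of the estimates given in (c).
(g) Approximate $z$-statistics, i.e., $Z_i = \hat{\beta}_i / \mathrm{se}(\hat{\beta}_i)$ for testing $H_0 : \beta_i = 0$. For $i = 1,2,\ldots,p$, $Z_i$ has an approximate $N(0,1)$ distribution.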
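The statistics listed above follow from $X$, $a$ and $B - A$ by plain linear algebra. The NumPy sketch below shows only that algebra, under loud assumptions: the values of `a` and `B_minus_A` are placeholders chosen to give a positive definite system, not genuine expected order statistics, and none of this reproduces the library's internal computation.

```python
import numpy as np

# Hypothetical design matrix (n = 4 observations, p = 2 variables).
X = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 2.0]])

# Placeholder scores a_i and placeholder matrix B - A (assumptions,
# not real expected values of order statistics).
a = np.array([-1.0, -0.3, 0.3, 1.0])
n = len(a)
B_minus_A = np.eye(n) - np.ones((n, n)) / n

score = X.T @ a                       # score statistic X^T a
score_cov = X.T @ B_minus_A @ X       # its covariance X^T (B - A) X
M = np.linalg.inv(score_cov)          # covariance of beta_hat
beta_hat = M @ score                  # estimate M X^T a
Q = beta_hat @ score_cov @ beta_hat   # chi-squared statistic
std_err = np.sqrt(np.diag(M))         # standard errors M_ii^(1/2)
z = beta_hat / std_err                # approximate z-statistics
```

The two forms of $Q$ agree because $\hat{\beta}^{T} M^{-1} \hat{\beta} = a^{T} X M X^{T} a$ once $\hat{\beta} = M X^{T} a$ is substituted.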
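In many situations, more than one sample of observations will be available. In this case we assume the model

$$h_k(Y_k) = X_k \beta + \epsilon_k, \qquad k = 1,2,\ldots,\mathrm{ns},$$

where ns is the number of samples. In an obvious manner, $Y_k$ and $X_k$ are the vector of observations and the design matrix for the $k$th sample respectively. Note that the arbitrary transformation $h_k$ can be assumed different for each sample since observations are ranked within the sample.
The earlier analysis can be extended to give a combined estimate of $\beta$ as $\hat{\beta} = D d$, where

$$D^{-1} = \sum_{k=1}^{\mathrm{ns}} X_k^{T}(B_k - A_k)X_k$$

and

$$d = \sum_{k=1}^{\mathrm{ns}} X_k^{T} a_k,$$

with $a_k$, $B_k$ and $A_k$ defined as $a$, $B$ and $A$ above but for the $k$th sample.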
The remaining statistics are calculated as for the one sample case.
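The multi-sample combination amounts to summing the per-sample score vectors and score covariance matrices before solving. A minimal NumPy sketch follows; the function name `combined_estimate`, the helper `placeholder_b_minus_a` and all data values are hypothetical, and the $B_k - A_k$ matrices are placeholders rather than real order-statistic moments.

```python
import numpy as np


def combined_estimate(samples):
    """Combine per-sample (X_k, a_k, B_k - A_k) triples into beta_hat = D d."""
    p = samples[0][0].shape[1]
    D_inv = np.zeros((p, p))
    d = np.zeros(p)
    for X_k, a_k, BA_k in samples:
        D_inv += X_k.T @ BA_k @ X_k   # accumulate X_k^T (B_k - A_k) X_k
        d += X_k.T @ a_k              # accumulate X_k^T a_k
    D = np.linalg.inv(D_inv)
    return D @ d, D


def placeholder_b_minus_a(n):
    """Stand-in for B_k - A_k (an assumption, for illustration only)."""
    return np.eye(n) - np.ones((n, n)) / n


# Two hypothetical samples sharing the same p = 2 explanatory variables.
sample_1 = (
    np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 2.0]]),
    np.array([-1.0, -0.3, 0.3, 1.0]),
    placeholder_b_minus_a(4),
)
sample_2 = (
    np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 2.0]]),
    np.array([-0.8, 0.0, 0.8]),
    placeholder_b_minus_a(3),
)

beta_hat, D = combined_estimate([sample_1, sample_2])
```

With a single sample the function reduces to the one-sample estimate $M X^{T} a$, and $D$ plays the role of $M$ when the remaining statistics are computed.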
- References
Pettitt, A N, 1982, Inference for the linear model using a likelihood based on ranks, J. Roy. Statist. Soc. Ser. B (44), 234–243