NAG FL Interface
g08raf (rank_regsn)
1
Purpose
g08raf calculates the parameter estimates, score statistics and their variance-covariance matrices for the linear model using a likelihood based on the ranks of the observations.
2
Specification
Fortran Interface
Subroutine g08raf (ns, nv, nsum, y, ip, x, ldx, idist, nmax, tol, prvr, ldprvr, irank, zin, eta, vapvec, &
                   parest, work, lwork, iwa, ifail)
Integer, Intent (In) :: ns, nsum, ip, ldx, idist, nmax, ldprvr, lwork
Integer, Intent (Inout) :: nv(ns), ifail
Integer, Intent (Out) :: irank(nmax), iwa(0)
Real (Kind=nag_wp), Intent (In) :: y(nsum), x(ldx,ip), tol
Real (Kind=nag_wp), Intent (Inout) :: prvr(ldprvr,ip)
Real (Kind=nag_wp), Intent (Out) :: zin(nmax), eta(nmax), vapvec(nmax*(nmax+1)/2), parest(4*ip+1), work(0)
C Header Interface
#include <nag.h>
void g08raf_ (const Integer *ns, Integer nv[], const Integer *nsum, const double y[], const Integer *ip, const double x[], const Integer *ldx, const Integer *idist, const Integer *nmax, const double *tol, double prvr[], const Integer *ldprvr, Integer irank[], double zin[], double eta[], double vapvec[], double parest[], double work[], const Integer *lwork, Integer iwa[], Integer *ifail);
C++ Header Interface
#include <nag.h>
extern "C" {
void g08raf_ (const Integer &ns, Integer nv[], const Integer &nsum, const double y[], const Integer &ip, const double x[], const Integer &ldx, const Integer &idist, const Integer &nmax, const double &tol, double prvr[], const Integer &ldprvr, Integer irank[], double zin[], double eta[], double vapvec[], double parest[], double work[], const Integer &lwork, Integer iwa[], Integer &ifail);
}
The routine may be called by the names g08raf or nagf_nonpar_rank_regsn.
3
Description
Analysis of data can be made by replacing observations by their ranks. The analysis produces inference for regression parameters arising from the following model.
For random variables $Y_1, Y_2, \ldots, Y_n$ we assume that, after an arbitrary monotone increasing differentiable transformation, $h(\cdot)$, the model
$$h(Y_i) = x_i^{\mathrm{T}} \beta + \epsilon_i \qquad (1)$$
holds, where $x_i$ is a known vector of explanatory variables and $\beta$ is a vector of $p$ unknown regression coefficients. The $\epsilon_i$ are random variables assumed to be independent and identically distributed with a completely known distribution which can be one of the following: Normal, logistic, extreme value or double-exponential. In Pettitt (1982) an estimate for $\beta$ is proposed as $\hat{\beta} = M X^{\mathrm{T}} a$ with estimated variance-covariance matrix $M$. The statistics $a$ and $M$ depend on the ranks $r_i$ of the observations $Y_i$ and the density chosen for $\epsilon_i$.
The matrix $X$ is the $n$ by $p$ matrix of explanatory variables. It is assumed that $X$ is of rank $p$ and that a column or a linear combination of columns of $X$ is not equal to the column vector of $1$ or a multiple of it. This means that a constant term cannot be included in the model (1). The statistics $a$ and $M$ are found as follows. Let $\epsilon_i$ have pdf $f(\epsilon)$ and let $g = -f'/f$. Let $W_1, W_2, \ldots, W_n$ be order statistics for a random sample of size $n$ with the density $f(\cdot)$. Define $Z_i = g(W_i)$, then $a_i = E(Z_{r_i})$. To define $M$ we need $M^{-1} = X^{\mathrm{T}} (B - A) X$, where $B$ is an $n$ by $n$ diagonal matrix with $B_{ii} = E(g'(W_{r_i}))$ and $A$ is a symmetric matrix with $A_{ij} = \mathrm{cov}(Z_{r_i}, Z_{r_j})$. In the case of the Normal distribution, the $Z_1 < \cdots < Z_n$ are standard Normal order statistics and $E(g'(W_i)) = 1$, for $i = 1, 2, \ldots, n$.
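For the logistic distribution the quantities $a$, $B$ and $A$ have closed forms, which allows a small numerical check of the formulas above. The Python sketch below is illustrative only and is not the NAG implementation: it assumes no ties, a single explanatory variable ($p = 1$), and uses hypothetical helper names (`logistic_rank_quantities`, `fit_one_variable`). It relies on $g = 2F - 1$ for the logistic density, where $F$ is the distribution function, together with the standard moments of uniform order statistics.

```python
def logistic_rank_quantities(ranks):
    """Return (a, B - A) for logistic errors; assumes distinct ranks 1..n."""
    n = len(ranks)
    # a_i = E[g(W_(r_i))] = 2 r_i / (n + 1) - 1, because g = 2F - 1
    a = [2.0 * r / (n + 1) - 1.0 for r in ranks]
    b_minus_a = [[0.0] * n for _ in range(n)]
    for i, ri in enumerate(ranks):
        for j, rj in enumerate(ranks):
            lo, hi = min(ri, rj), max(ri, rj)
            # A_ij = cov(Z_(ri), Z_(rj)) = 4 lo (n + 1 - hi) / ((n + 1)^2 (n + 2))
            a_ij = 4.0 * lo * (n + 1 - hi) / ((n + 1) ** 2 * (n + 2))
            # B is diagonal: B_ii = E[g'(W_(ri))] = 2 ri (n + 1 - ri) / ((n + 1)(n + 2))
            b_ij = 2.0 * ri * (n + 1 - ri) / ((n + 1) * (n + 2)) if i == j else 0.0
            b_minus_a[i][j] = b_ij - a_ij
    return a, b_minus_a


def fit_one_variable(x, ranks):
    """Score X^T a, information X^T (B - A) X, beta_hat = M X^T a and M, for p = 1."""
    a, bma = logistic_rank_quantities(ranks)
    n = len(x)
    score = sum(x[i] * a[i] for i in range(n))
    info = sum(x[i] * bma[i][j] * x[j] for i in range(n) for j in range(n))
    m = 1.0 / info  # M = (X^T (B - A) X)^{-1}, a scalar when p = 1
    return score, info, m * score, m


# Made-up data: three observations already in rank order.
score, info, beta_hat, m = fit_one_variable([1.0, 2.0, 4.0], [1, 2, 3])
print(score, info, beta_hat)
```

In this logistic case every row of $B - A$ sums to zero, so a column of ones in $X$ would make $X^{\mathrm{T}}(B - A)X$ singular; this is consistent with the requirement that no constant term be included in the model.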
The analysis can also deal with ties in the data. Two observations are adjudged to be tied if $|Y_i - Y_j| < \mathbf{tol}$, where tol is a user-supplied tolerance level.
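The effect of the tolerance can be pictured with the short Python sketch below. It is an illustration under stated assumptions rather than the routine's actual algorithm: it sorts the observations and starts a new group whenever the gap to the previous sorted value is at least tol. The helper name `tie_groups` and the data are hypothetical.

```python
def tie_groups(y, tol):
    """Group indices of y whose sorted neighbours differ by less than tol."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    groups = []
    for idx in order:
        if groups and abs(y[idx] - y[groups[-1][-1]]) < tol:
            groups[-1].append(idx)  # tied with the previous sorted value
        else:
            groups.append([idx])    # start a new group
    return groups


y = [3.5, 1.0, 2.0, 1.00001, 3.50001]
groups = tie_groups(y, tol=1.0e-4)
# A single group would mean every observation is tied, which usually
# signals that tol is too large for the scale of the data.
all_tied = len(groups) == 1
print(groups, all_tied)
```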
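Various statistics can be found from the analysis:
(a) The score statistic $X^{\mathrm{T}} a$. This statistic is used to test the hypothesis $H_0: \beta = 0$, see (e).
(b) The estimated variance-covariance matrix $X^{\mathrm{T}} (B - A) X$ of the score statistic in (a).
(c) The estimate $\hat{\beta} = M X^{\mathrm{T}} a$.
(d) The estimated variance-covariance matrix $M = (X^{\mathrm{T}} (B - A) X)^{-1}$ of the estimate $\hat{\beta}$.
(e) The $\chi^2$ statistic $Q = \hat{\beta}^{\mathrm{T}} M^{-1} \hat{\beta} = a^{\mathrm{T}} X (X^{\mathrm{T}} (B - A) X)^{-1} X^{\mathrm{T}} a$ used to test $H_0: \beta = 0$. Under $H_0$, $Q$ has an approximate $\chi^2$-distribution with $p$ degrees of freedom.
(f) The standard errors $M_{ii}^{1/2}$ of the estimates given in (c).
(g) Approximate $z$-statistics, i.e., $Z_i = \hat{\beta}_i / \mathrm{se}(\hat{\beta}_i)$ for testing $H_0: \beta_i = 0$. For $i = 1, 2, \ldots, p$, $Z_i$ has an approximate $N(0,1)$ distribution.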
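Once the score statistic and its variance-covariance matrix are available, the remaining statistics in the list follow by simple linear algebra. The Python sketch below shows this for two parameters, using the closed-form inverse of a 2 by 2 matrix. It is a hedged illustration, not the routine's code; the function name `rank_regression_stats` and the input numbers are made up for the example.

```python
import math


def rank_regression_stats(score, info):
    """Statistics (c) to (g) from a 2-vector score and its 2x2 covariance matrix."""
    (v11, v12), (v21, v22) = info
    det = v11 * v22 - v12 * v21
    if v11 <= 0.0 or det <= 0.0:
        raise ValueError("information matrix is not positive definite")
    # (d) M is the inverse of the score covariance matrix
    m = [[v22 / det, -v12 / det], [-v21 / det, v11 / det]]
    # (c) beta_hat = M * score
    beta = [m[i][0] * score[0] + m[i][1] * score[1] for i in range(2)]
    # (e) Q = score^T M score
    q = score[0] * beta[0] + score[1] * beta[1]
    # (f) standard errors are the square roots of the diagonal of M
    se = [math.sqrt(m[i][i]) for i in range(2)]
    # (g) approximate z-statistics
    z = [beta[i] / se[i] for i in range(2)]
    return beta, m, q, se, z


beta, m, q, se, z = rank_regression_stats([3.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
print(beta, q, se, z)
```

Under the null hypothesis, `q` would be compared with a $\chi^2$ distribution on 2 degrees of freedom and each entry of `z` with a standard Normal distribution.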
In many situations, more than one sample of observations will be available. In this case we assume the model
$$h_k(Y_k) = X_k \beta + e_k, \quad k = 1, 2, \ldots, \mathbf{ns},$$
where ns is the number of samples. In an obvious manner, $Y_k$ and $X_k$ are the vector of observations and the design matrix for the $k$th sample respectively. Note that the arbitrary transformation $h_k$ can be assumed different for each sample since observations are ranked within the sample.
The earlier analysis can be extended to give a combined estimate of $\beta$ as $\hat{\beta} = D d$, where
$$D^{-1} = \sum_{k=1}^{\mathbf{ns}} X_k^{\mathrm{T}} (B_k - A_k) X_k$$
and
$$d = \sum_{k=1}^{\mathbf{ns}} X_k^{\mathrm{T}} a_k,$$
with $a_k$, $B_k$ and $A_k$ defined as $a$, $B$ and $A$ above but for the $k$th sample.
The remaining statistics are calculated as for the one sample case.
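The pooling across samples amounts to adding the per-sample score statistics and the per-sample information matrices before inverting. A minimal Python sketch for a single parameter, where both quantities are scalars, is given below; the function name `combine_samples` and the numbers are hypothetical and serve only to show the arithmetic.

```python
def combine_samples(contributions):
    """Pool per-sample (score, information) pairs for one parameter.

    Each pair is (d_k, v_k), with d_k the score statistic and v_k the
    information for sample k, both computed from within-sample ranks.
    """
    d = sum(dk for dk, _ in contributions)      # pooled score
    d_inv = sum(vk for _, vk in contributions)  # pooled information, the inverse of D
    big_d = 1.0 / d_inv
    return big_d * d, big_d                     # combined estimate and its variance


beta_hat, variance = combine_samples([(1.5, 0.95), (0.5, 1.05)])
print(beta_hat, variance)
```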
4
References
Pettitt A N (1982) Inference for the linear model using a likelihood based on ranks J. Roy. Statist. Soc. Ser. B 44 234–243
5
Arguments
1: ns – Integer, Input
On entry: the number of samples.
Constraint: $\mathbf{ns} \geq 1$.
2: nv(ns) – Integer array, Input
On entry: the number of observations in the $i$th sample, for $i = 1, 2, \ldots, \mathbf{ns}$.
Constraint: $\mathbf{nv}(i) \geq 1$, for $i = 1, 2, \ldots, \mathbf{ns}$.
3: nsum – Integer, Input
On entry: the total number of observations.
Constraint: $\mathbf{nsum} = \sum_{i=1}^{\mathbf{ns}} \mathbf{nv}(i)$.
4: y(nsum) – Real (Kind=nag_wp) array, Input
On entry: the observations in each sample. Specifically, $\mathbf{y}\left(\sum_{k=1}^{i-1} \mathbf{nv}(k) + j\right)$ must contain the $j$th observation in the $i$th sample.
5: ip – Integer, Input
On entry: the number of parameters to be fitted.
Constraint: $\mathbf{ip} \geq 1$.
6: x(ldx,ip) – Real (Kind=nag_wp) array, Input
On entry: the design matrices for each sample. Specifically, $\mathbf{x}\left(\sum_{k=1}^{i-1} \mathbf{nv}(k) + j, l\right)$ must contain the value of the $l$th explanatory variable for the $j$th observation in the $i$th sample.
Constraint: x must not contain a column with all elements equal.
7: ldx – Integer, Input
On entry: the first dimension of the array x as declared in the (sub)program from which g08raf is called.
Constraint: $\mathbf{ldx} \geq \mathbf{nsum}$.
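8: idist – Integer, Input
On entry: the error distribution to be used in the analysis.
$\mathbf{idist} = 1$: Normal.
$\mathbf{idist} = 2$: Logistic.
$\mathbf{idist} = 3$: Extreme value.
$\mathbf{idist} = 4$: Double-exponential.
Constraint: $1 \leq \mathbf{idist} \leq 4$.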
9: nmax – Integer, Input
On entry: the value of the largest sample size.
Constraint: $\mathbf{nmax} = \max_{1 \leq i \leq \mathbf{ns}} \mathbf{nv}(i)$ and $\mathbf{nmax} > \mathbf{ip}$.
10: tol – Real (Kind=nag_wp), Input
On entry: the tolerance for judging whether two observations are tied. Thus, observations $Y_i$ and $Y_j$ are adjudged to be tied if $|Y_i - Y_j| < \mathbf{tol}$.
Constraint: $\mathbf{tol} > 0.0$.
11: prvr(ldprvr,ip) – Real (Kind=nag_wp) array, Output
On exit: the variance-covariance matrices of the score statistics and the parameter estimates, the former being stored in the upper triangle and the latter in the lower triangle. Thus for $1 \leq i \leq j \leq \mathbf{ip}$, $\mathbf{prvr}(i,j)$ contains an estimate of the covariance between the $i$th and $j$th score statistics. For $1 \leq j \leq i \leq \mathbf{ip}$, $\mathbf{prvr}(i+1,j)$ contains an estimate of the covariance between the $i$th and $j$th parameter estimates.
12: ldprvr – Integer, Input
On entry: the first dimension of the array prvr as declared in the (sub)program from which g08raf is called.
Constraint: $\mathbf{ldprvr} \geq \mathbf{ip} + 1$.
13: irank(nmax) – Integer array, Output
On exit: for the one sample case, irank contains the ranks of the observations.
14: zin(nmax) – Real (Kind=nag_wp) array, Output
On exit: for the one sample case, zin contains the expected values of the function $g(\cdot)$ of the order statistics.
15: eta(nmax) – Real (Kind=nag_wp) array, Output
On exit: for the one sample case, eta contains the expected values of the function $g'(\cdot)$ of the order statistics.
16: vapvec(nmax×(nmax+1)/2) – Real (Kind=nag_wp) array, Output
On exit: for the one sample case, vapvec contains the upper triangle of the variance-covariance matrix of the function $g(\cdot)$ of the order statistics stored column-wise.
17: parest(4×ip+1) – Real (Kind=nag_wp) array, Output
On exit: the statistics calculated by the routine.
The first ip components of parest contain the score statistics.
The next ip elements contain the parameter estimates.
$\mathbf{parest}(2 \times \mathbf{ip} + 1)$ contains the value of the $\chi^2$ statistic.
The next ip elements of parest contain the standard errors of the parameter estimates.
Finally, the remaining ip elements of parest contain the $z$-statistics.
18: work(0) – Real (Kind=nag_wp) array, Output
19: lwork – Integer, Input
20: iwa(0) – Integer array, Output
work, lwork and iwa are no longer required by g08raf but are retained for backwards compatibility.
21: ifail – Integer, Input/Output
On entry: ifail must be set to $0$, $-1$ or $1$. If you are unfamiliar with this argument you should refer to Section 4 in the Introduction to the NAG Library FL Interface for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is $0$. When the value $-1$ or $1$ is used it is essential to test the value of ifail on exit.
On exit: $\mathbf{ifail} = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).
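The packed layouts of parest and prvr can be easier to follow with a small unpacking example. The Python sketch below is an illustration only: it assumes the arrays have already been copied into Python lists (prvr as a list of ip+1 rows with ip entries each), uses 0-based indexing, and the helper names and numbers are hypothetical rather than output of the routine.

```python
def unpack_parest(parest, ip):
    """Split a (4*ip + 1)-element parest into its five parts (0-based slices)."""
    if len(parest) != 4 * ip + 1:
        raise ValueError("parest must have 4*ip + 1 elements")
    return {
        "score": parest[0:ip],
        "estimate": parest[ip:2 * ip],
        "chi2": parest[2 * ip],  # Fortran element parest(2*ip + 1)
        "std_err": parest[2 * ip + 1:3 * ip + 1],
        "z": parest[3 * ip + 1:4 * ip + 1],
    }


def unpack_prvr(prvr, ip):
    """Return (score_cov, param_cov) as full symmetric ip-by-ip matrices."""
    score_cov = [[0.0] * ip for _ in range(ip)]
    param_cov = [[0.0] * ip for _ in range(ip)]
    for i in range(ip):
        for j in range(i, ip):
            # Fortran prvr(i, j) with i <= j: score covariances in the upper triangle
            score_cov[i][j] = score_cov[j][i] = prvr[i][j]
            # Fortran prvr(i + 1, j) with j <= i: parameter covariances, one row lower
            param_cov[j][i] = param_cov[i][j] = prvr[j + 1][i]
    return score_cov, param_cov


def vapvec_index(i, j):
    """0-based position of element (i, j), i <= j, in a column-wise packed upper triangle."""
    return i + j * (j + 1) // 2


parts = unpack_parest([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0], ip=2)
score_cov, param_cov = unpack_prvr([[4.0, 1.0], [0.5, 3.0], [-0.1, 0.4]], ip=2)
print(parts["chi2"], score_cov, param_cov)
```

Storing the parameter covariances one row below the diagonal is what makes the extra row in prvr necessary.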
6
Error Indicators and Warnings
If on entry $\mathbf{ifail} = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
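Errors or warnings detected by the routine:
ifail = 1
On entry, $\langle\mathit{value}\rangle$ elements of $\mathbf{nv} < 1$.
Constraint: $\mathbf{nv}(i) \geq 1$.
On entry, $\mathbf{ip} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{ip} \geq 1$.
On entry, $\mathbf{ldprvr} = \langle\mathit{value}\rangle$ and $\mathbf{ip} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{ldprvr} \geq \mathbf{ip} + 1$.
On entry, $\mathbf{ldx} = \langle\mathit{value}\rangle$ and $\mathbf{nsum} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{ldx} \geq \mathbf{nsum}$.
On entry, $\max_i \mathbf{nv}(i) = \langle\mathit{value}\rangle$ and $\mathbf{nmax} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{nmax} = \max_i \mathbf{nv}(i)$.
On entry, $\mathbf{nmax} = \langle\mathit{value}\rangle$ and $\mathbf{ip} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{nmax} > \mathbf{ip}$.
On entry, $\mathbf{ns} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{ns} \geq 1$.
On entry, $\mathbf{tol} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{tol} > 0.0$.
On entry, $\sum_i \mathbf{nv}(i) = \langle\mathit{value}\rangle$ and $\mathbf{nsum} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{nsum} = \sum_i \mathbf{nv}(i)$.
ifail = 2
On entry, $\mathbf{idist} = \langle\mathit{value}\rangle$.
Constraint: $\mathbf{idist} = 1$, $2$, $3$ or $4$.
ifail = 3
On entry, all the observations were adjudged to be tied. You are advised to check the value supplied for tol.
ifail = 4
The matrix $X^{\mathrm{T}} (B - A) X$ is either ill-conditioned or not positive definite. This error should only occur with extreme rankings of the data.
ifail = 5
On entry, for $j = \langle\mathit{value}\rangle$, $\mathbf{x}(i,j) = \langle\mathit{value}\rangle$ for all $i$.
Constraint: $\mathbf{x}(i,j) \neq \mathbf{x}(i+1,j)$ for at least one $i$.
ifail = -99
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.
ifail = -399
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.
ifail = -999
Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.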
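Most of the input errors can be caught before the call by checking the argument constraints directly. The Python sketch below lists the constraints as understood from the argument descriptions; it is a hypothetical pre-check with made-up helper names, not part of the NAG Library, and it does not reproduce the routine's own error codes or messages.

```python
def violated_constraints(ns, nv, nsum, ip, ldx, idist, nmax, tol, ldprvr):
    """Return descriptions of the input constraints that the arguments break."""
    checks = [
        (ns >= 1, "ns >= 1"),
        (len(nv) == ns and all(n >= 1 for n in nv), "nv(i) >= 1 for i = 1..ns"),
        (nsum == sum(nv), "nsum = sum of nv(i)"),
        (ip >= 1, "ip >= 1"),
        (ldx >= nsum, "ldx >= nsum"),
        (idist in (1, 2, 3, 4), "idist = 1, 2, 3 or 4"),
        (len(nv) > 0 and nmax == max(nv), "nmax = max nv(i)"),
        (nmax > ip, "nmax > ip"),
        (tol > 0.0, "tol > 0.0"),
        (ldprvr >= ip + 1, "ldprvr >= ip + 1"),
    ]
    return [text for ok, text in checks if not ok]


def constant_columns(x):
    """Return 0-based indices of columns of x (a list of rows) whose entries are all equal."""
    return [j for j in range(len(x[0])) if all(row[j] == x[0][j] for row in x)]


good = violated_constraints(1, [10], 10, 2, 10, 2, 10, 1.0e-5, 3)
bad = violated_constraints(1, [10], 10, 2, 10, 5, 10, 1.0e-5, 2)
flat = constant_columns([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]])
print(good, bad, flat)
```

A column flagged by `constant_columns` corresponds to a constant term, which the model does not allow.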
7
Accuracy
The computations are believed to be stable.
8
Parallelism and Performance
g08raf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g08raf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the
X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the
Users' Note for your implementation for any additional implementation-specific information.
9
Further Comments
The time taken by g08raf depends on the number of samples, the total number of observations and the number of parameters fitted.
In extreme cases the parameter estimates for certain models can be infinite, although this is unlikely to occur in practice. See
Pettitt (1982) for further details.
10
Example
A program to fit a regression model to a single sample of observations using two explanatory variables. The error distribution will be taken to be logistic.
10.1
Program Text
10.2
Program Data
10.3
Program Results