g02qg performs a multiple linear quantile regression. Parameter estimates and, if required, confidence limits, covariance matrices and residuals are calculated. g02qg may be used to perform a weighted quantile regression. A simplified interface for g02qg is provided by g02qf.

Syntax

C#

public static void g02qg(
	int sorder,
	int ic1,
	int n,
	int m,
	double[,] dat,
	int[] isx,
	int ip,
	double[] y,
	double[] wt,
	int ntau,
	double[] tau,
	out double df,
	double[,] b,
	double[,] bl,
	double[,] bu,
	double[,,] ch,
	double[,] res,
	G02.g02qgOptions options,
	G05.G05State g05state,
	int[] info,
	out int ifail
)

Visual Basic

Public Shared Sub g02qg ( _
	sorder As Integer, _
	ic1 As Integer, _
	n As Integer, _
	m As Integer, _
	dat As Double(,), _
	isx As Integer(), _
	ip As Integer, _
	y As Double(), _
	wt As Double(), _
	ntau As Integer, _
	tau As Double(), _
	<OutAttribute> ByRef df As Double, _
	b As Double(,), _
	bl As Double(,), _
	bu As Double(,), _
	ch As Double(,,), _
	res As Double(,), _
	options As G02.g02qgOptions, _
	g05state As G05.G05State, _
	info As Integer(), _
	<OutAttribute> ByRef ifail As Integer _
)

Visual C++

public:
static void g02qg(
	int sorder, 
	int ic1, 
	int n, 
	int m, 
	array<double,2>^ dat, 
	array<int>^ isx, 
	int ip, 
	array<double>^ y, 
	array<double>^ wt, 
	int ntau, 
	array<double>^ tau, 
	[OutAttribute] double% df, 
	array<double,2>^ b, 
	array<double,2>^ bl, 
	array<double,2>^ bu, 
	array<double,3>^ ch, 
	array<double,2>^ res, 
	G02.g02qgOptions^ options, 
	G05.G05State^ g05state, 
	array<int>^ info, 
	[OutAttribute] int% ifail
)

F#

static member g02qg : 
        sorder : int * 
        ic1 : int * 
        n : int * 
        m : int * 
        dat : float[,] * 
        isx : int[] * 
        ip : int * 
        y : float[] * 
        wt : float[] * 
        ntau : int * 
        tau : float[] * 
        df : float byref * 
        b : float[,] * 
        bl : float[,] * 
        bu : float[,] * 
        ch : float[,,] * 
        res : float[,] * 
        options : G02.g02qgOptions * 
        g05state : G05.G05State * 
        info : int[] * 
        ifail : int byref -> unit

Parameters

sorder: Type: System.Int32
On entry: determines the storage order of variates supplied in dat.

Constraint: $sorder = 1$ or $2$ .

ic1

Type: System.Int32

On entry: indicates whether an intercept will be included in the model. The intercept is included by adding a column of ones as the first column in the design matrix, $X$.

$ic1 = 1$: An intercept will be included in the model.
$ic1 = 0$: An intercept will not be included in the model.

Constraint: $ic1 = 0$ or $1$.

n: Type: System.Int32
On entry: the total number of observations in the dataset. Writing $n$ for the effective number of observations used in the fit: if no weights are supplied, or no zero weights are supplied, or observations with zero weights are included in the model, then n $= n$. Otherwise n $= n +$ the number of observations with zero weights.

Constraint: $n \geq 2$ .

m: Type: System.Int32
On entry: $m$ , the total number of variates in the dataset.

Constraint: $m \geq 0$ .

dat

Type: System.Double[,]

An array of size [dim1, _sddat]

Note: dim1 must satisfy the constraint:

if $sorder = 1$, $dim1 \geq n$;
otherwise $dim1 \geq m$.

On entry: the $i$th value for the $j$th variate, for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, m$, must be supplied in

$dat [i - 1, j - 1]$ if $sorder = 1$, and
$dat [j - 1, i - 1]$ if $sorder = 2$.

The design matrix $X$ is constructed from dat, isx and ic1.

isx

Type: System.Int32[]

An array of size [m]

On entry: indicates which independent variables are to be included in the model.

$isx [j - 1] = 0$: The $j$ th variate, supplied in dat, is not included in the regression model.
$isx [j - 1] = 1$: The $j$ th variate, supplied in dat, is included in the regression model.

Constraints:

$isx [j - 1] = 0$ or $1$ , for $j = 1, 2, \dots, m$ ;
if $ic1 = 1$ , exactly $ip - 1$ values of isx must be set to $1$ ;
if $ic1 = 0$ , exactly ip values of isx must be set to $1$ .

ip

Type: System.Int32

On entry: $p$, the number of independent variables in the model, including the intercept (see ic1), if present.

Constraints:

$1 \leq ip < n$ ;
if $ic1 = 1$ , $1 \leq ip \leq m + 1$ ;
if $ic1 = 0$ , $1 \leq ip \leq m$ .
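
The way dat, isx, ic1 and ip fit together can be sketched in Python. This is only an illustration of the indexing rules stated above, not the library's implementation; the helper name `build_design_matrix` and the use of nested lists in place of two-dimensional arrays are assumptions made for the example.

```python
def build_design_matrix(dat, isx, ic1, sorder, n, m):
    """Return (X, ip), where X is the n-by-ip design matrix as a list of rows.

    Hypothetical helper: mirrors the documented rules for dat, isx and ic1.
    """
    if sorder not in (1, 2):
        raise ValueError("sorder must be 1 or 2")
    if ic1 not in (0, 1):
        raise ValueError("ic1 must be 0 or 1")
    if any(flag not in (0, 1) for flag in isx[:m]):
        raise ValueError("isx entries must be 0 or 1")

    # Zero-based indices of the variates selected by isx.
    cols = [j for j in range(m) if isx[j] == 1]
    # The intercept, if present, counts towards ip.
    ip = len(cols) + ic1
    if not 1 <= ip < n:
        raise ValueError("need 1 <= ip < n")

    X = []
    for i in range(n):
        # The intercept is a leading column of ones.
        row = [1.0] if ic1 == 1 else []
        for j in cols:
            # sorder = 1 stores dat[i][j]; sorder = 2 stores dat[j][i].
            row.append(dat[i][j] if sorder == 1 else dat[j][i])
        X.append(row)
    return X, ip
```

With three observations on two variates, of which only the second is selected and an intercept is requested, both storage orders give the same three-by-two design matrix and ip equal to 2.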

y: Type: System.Double[]
An array of size [n]
On entry: $y$ , observations on the dependent variable.

wt

Type: System.Double[]

An array of size [_lwt]

On entry: if $_weight = "W"$, wt must contain the diagonal elements of the weight matrix $W$. Otherwise wt is not referenced.

When

$Drop Zero Weights ='YES'$: If $wt [i - 1] = 0.0$, the $i$th observation is not included in the model, in which case the effective number of observations, $n$, is the number of observations with nonzero weights. If $Return Residuals ='YES'$, the values of res will be set to zero for observations with zero weights.
$Drop Zero Weights ='NO'$: All observations are included in the model and the effective number of observations, $n$, is equal to the argument n.

Constraints:

If $_weight = "W"$ , $wt [i - 1] \geq 0.0$ , for $i = 1, 2, \dots, n$ ;
The effective number of observations $\geq 2$ .

ntau: Type: System.Int32
On entry: the number of quantiles of interest.

Constraint: $ntau \geq 1$ .

tau: Type: System.Double[]
An array of size [ntau]
On entry: the vector of quantiles of interest. A separate model is fitted to each quantile.

Constraint: $\sqrt{ε} < tau [j - 1] < 1 - \sqrt{ε}$ where $ε$ is the machine precision returned by x02aj, for $j = 1, 2, \dots, ntau$ .
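
The constraint on tau can be checked before calling the method. The sketch below is illustrative only: `check_tau` is a hypothetical helper, and `sys.float_info.epsilon` is used as a stand-in for the machine precision returned by x02aj, which may not be the same value.

```python
import math
import sys


def check_tau(tau, eps=sys.float_info.epsilon):
    """Return one flag per quantile: True if sqrt(eps) < tau_j < 1 - sqrt(eps).

    eps is an assumed stand-in for the value returned by x02aj.
    """
    lower = math.sqrt(eps)
    upper = 1.0 - lower
    return [lower < t < upper for t in tau]
```

Quantiles such as 0.1 or 0.5 pass the check, whereas 0, 1 and values within $\sqrt{ε}$ of either end do not.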

df: Type: System.Double
On exit: the degrees of freedom given by $n - k$ , where $n$ is the effective number of observations and $k$ is the rank of the cross-product matrix $X^{T} X$ .

b: Type: System.Double[,]
An array of size [ip, ntau]
On entry: if $Calculate Initial Values ='NO'$, $b [i - 1, l - 1]$ must hold an initial estimate for ${\hat{β}}_{i}$, for $i = 1, 2, \dots, ip$ and $l = 1, 2, \dots, ntau$. If $Calculate Initial Values ='YES'$, b need not be set.
On exit: $b [i - 1, l - 1]$ , for $i = 1, 2, \dots, ip$ , contains the estimates of the parameters of the regression model, $\hat{β}$ , estimated for $τ = tau [l - 1]$ .
If $ic1 = 1$ , $b [0, l - 1]$ will contain the estimate corresponding to the intercept and $b [i, l - 1]$ will contain the coefficient of the $j$ th variate contained in dat, where $isx [j - 1]$ is the $i$ th nonzero value in the array isx.

If $ic1 = 0$ , $b [i - 1, l - 1]$ will contain the coefficient of the $j$ th variate contained in dat, where $isx [j - 1]$ is the $i$ th nonzero value in the array isx.

bl: Type: System.Double[,]
An array of size [dim1, ntau]
Note: dim1 must be at least ip if $Interval Method \neq'NONE'$.
Note: the second dimension of the array bl must be at least $ntau$ if $Interval Method \neq'NONE'$ .
On exit: if $Interval Method \neq'NONE'$ , $bl [i - 1, l - 1]$ contains the lower limit of an $(100 \times α) %$ confidence interval for $b [i - 1, l - 1]$ , for $i = 1, 2, \dots, ip$ and $l = 1, 2, \dots, ntau$ .
If $Interval Method ='NONE'$ , bl is not referenced.

The method used for calculating the interval is controlled by the optional parameters Interval Method and Bootstrap Interval Method. The size of the interval, $α$ , is controlled by the optional parameter Significance Level.

bu: Type: System.Double[,]
An array of size [dim1, ntau]
Note: dim1 must be at least ip if $Interval Method \neq'NONE'$.
Note: the second dimension of the array bu must be at least $ntau$ if $Interval Method \neq'NONE'$ .
On exit: if $Interval Method \neq'NONE'$ , $bu [i - 1, l - 1]$ contains the upper limit of an $(100 \times α) %$ confidence interval for $b [i - 1, l - 1]$ , for $i = 1, 2, \dots, ip$ and $l = 1, 2, \dots, ntau$ .
If $Interval Method ='NONE'$ , bu is not referenced.

The method used for calculating the interval is controlled by the optional parameters Interval Method and Bootstrap Interval Method. The size of the interval, $α$, is controlled by the optional parameter Significance Level.

ch

Type: System.Double[,,]

An array of size [dim1, dim2, dim3]

Note: when ch is referenced, dim1 and dim2 must each be at least ip.

Note: the last dimension of the array ch must be at least $ntau$ if $Interval Method \neq'NONE'$ and $Matrix Returned ='COVARIANCE'$, and at least $ntau + 1$ if $Interval Method \neq'NONE'$, $'IID'$ or $'BOOTSTRAP XY'$ and $Matrix Returned ='H INVERSE'$.

On exit: depending on the supplied optional parameters, ch will either not be referenced, hold an estimate of the upper triangular part of the covariance matrix, $Σ$, or an estimate of the upper triangular parts of $n J_{n}$ and $n^{- 1} H_{n}^{- 1}$.

If $Interval Method ='NONE'$ or $Matrix Returned ='NONE'$, ch is not referenced.

If $Interval Method ='BOOTSTRAP XY'$ or $'IID'$ and $Matrix Returned ='H INVERSE'$, ch is not referenced.

Otherwise, for $i, j = 1, 2, \dots, ip, j \geq i$ and $l = 1, 2, \dots, ntau$:

If $Matrix Returned ='COVARIANCE'$, $ch [i - 1, j - 1, l - 1]$ holds an estimate of the covariance between $b [i - 1, l - 1]$ and $b [j - 1, l - 1]$.
If $Matrix Returned ='H INVERSE'$, $ch [i - 1, j - 1, 0]$ holds an estimate of the $(i, j)$th element of $n J_{n}$ and $ch [i - 1, j - 1, l + 1 - 1]$ holds an estimate of the $(i, j)$th element of $n^{- 1} H_{n}^{- 1}$, for $τ = tau [l - 1]$.

The method used for calculating $Σ$ and $H_{n}^{- 1}$ is controlled by the optional parameter Interval Method.

res: Type: System.Double[,]
An array of size [n, dim2]
Note: dim2 must be at least ntau if $Return Residuals ='YES'$.
On exit: if $Return Residuals ='YES'$ , $res [i - 1, l - 1]$ holds the (weighted) residuals, $r_{i}$ , for $τ = tau [l - 1]$ , for $i = 1, 2, \dots, n$ and $l = 1, 2, \dots, ntau$ .
If wt is not NULL and $Drop Zero Weights ='YES'$, the value of res will be set to zero for observations with zero weights.

If $Return Residuals ='NO'$ , res is not referenced.

options: Type: NagLibrary.G02.g02qgOptions
An Object of type G02.g02qgOptions. Used to configure optional parameters to this method.

g05state: Type: NagLibrary.G05.G05State
An Object of type G05.G05State.

info

Type: System.Int32[]

An array of size [ntau]

On exit: $info [i]$ holds additional information concerning the model fitting and confidence limit calculations when $τ = tau [i]$.
Code	Warning
$0$	Model fitted and confidence limits (if requested) calculated successfully
$1$	The method did not converge. The returned values are based on the estimate at the last iteration. Try increasing Iteration Limit whilst calculating the parameter estimates or relaxing the definition of convergence by increasing Tolerance.
$2$	A singular matrix was encountered during the optimization. The model was not fitted for this value of $τ$ .
$4$	Some truncation occurred whilst calculating the confidence limits for this value of $τ$ . See [Algorithmic Details] for details. The returned upper and lower limits may be narrower than specified.
$8$	The method did not converge whilst calculating the confidence limits. The returned limits are based on the estimate at the last iteration. Try increasing Iteration Limit.
$16$	Confidence limits for this value of $τ$ could not be calculated. The returned upper and lower limits are set to a large positive and large negative value respectively as defined by the optional parameter Big.

It is possible for multiple warnings to be applicable to a single model. In these cases the value returned in info is the sum of the corresponding individual nonzero warning codes.
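
Because the individual warning codes are distinct powers of two and are summed, a returned value can be unpacked with bitwise tests. The following Python sketch illustrates this; the name `decode_info` and the shortened message strings are assumptions made for the example and are not part of the library.

```python
# Warning codes from the table above, with abbreviated descriptions.
INFO_FLAGS = {
    1: "parameter estimates did not converge",
    2: "singular matrix encountered; model not fitted",
    4: "truncation occurred whilst calculating confidence limits",
    8: "confidence limit calculation did not converge",
    16: "confidence limits could not be calculated",
}


def decode_info(value):
    """Return the descriptions of every warning code summed into value."""
    return [msg for code, msg in sorted(INFO_FLAGS.items()) if value & code]
```

For example, a value of 5 decodes to the warnings for codes 1 and 4, and a value of 0 decodes to an empty list.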

ifail: Type: System.Int32
On exit: $ifail = 0$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

Description

Given a vector of $n$ observed values, $y = \{y_{i} : i = 1, 2, \dots, n\}$, an $n \times p$ design matrix $X$, a column vector, $x_{i}$, of length $p$ holding the $i$th row of $X$ and a quantile $τ \in (0, 1)$, g02qg estimates the $p$-element vector $β$ as the solution to

$$\underset{β \in ℝ^{p}}{\text{minimize}} \sum_{i = 1}^{n} ρ_{τ} (y_{i} - x_{i}^{T} β) \qquad (1)$$

where $ρ_{τ}$ is the piecewise linear loss function $ρ_{τ} (z) = z (τ - I (z < 0))$, and $I (z < 0)$ is an indicator function taking the value $1$ if $z < 0$ and $0$ otherwise. Weights can be incorporated by replacing $X$ and $y$ with $W X$ and $W y$ respectively, where $W$ is an $n \times n$ diagonal matrix. Observations with zero weights can either be included or excluded from the analysis; this is in contrast to least squares regression where such observations do not contribute to the objective function and are therefore always dropped.
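
The loss function and the objective in (1) are simple to evaluate directly. The Python sketch below does so for a candidate $β$; it is an illustration of the definition only, and the function names `check_loss` and `objective` are assumptions made for the example.

```python
def check_loss(z, tau):
    """rho_tau(z) = z * (tau - I(z < 0))."""
    indicator = 1.0 if z < 0 else 0.0
    return z * (tau - indicator)


def objective(y, X, beta, tau, w=None):
    """Sum of rho_tau over the (optionally weighted) residuals.

    w holds the diagonal of W; replacing X and y by WX and Wy scales
    each residual y_i - x_i^T beta by w_i.
    """
    total = 0.0
    for i, (yi, xi) in enumerate(zip(y, X)):
        wi = 1.0 if w is None else w[i]
        resid = wi * (yi - sum(xij * bj for xij, bj in zip(xi, beta)))
        total += check_loss(resid, tau)
    return total
```

For an intercept-only model with $τ = 0.5$ and observations 1, 2 and 3, the objective at $β = 2$ (the median) is 1.0, and it is larger at $β = 1.5$.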

g02qg uses the interior point algorithm of Portnoy and Koenker (1997), described briefly in [Algorithmic Details], to obtain the parameter estimates $\hat{β}$, for a given value of $τ$.

Under the assumption of Normally distributed errors, Koenker (2005) shows that the limiting covariance matrix of $\hat{β} - β$ has the form

$$Σ = \frac{τ (1 - τ)}{n} {H_{n}}^{- 1} J_{n} {H_{n}}^{- 1}$$

where $J_{n} = n^{- 1} \sum_{i = 1}^{n} x_{i} x_{i}^{T}$ and $H_{n}$ is a function of $τ$, as described below. Given an estimate of the covariance matrix, $\hat{Σ}$, lower (${\hat{β}}_{L}$) and upper (${\hat{β}}_{U}$) limits for an $(100 \times α) \%$ confidence interval can be calculated for each of the $p$ parameters, via

$${\hat{β}}_{L i} = {\hat{β}}_{i} - t_{n - p, (1 + α) / 2} \sqrt{{\hat{Σ}}_{i i}}, \qquad {\hat{β}}_{U i} = {\hat{β}}_{i} + t_{n - p, (1 + α) / 2} \sqrt{{\hat{Σ}}_{i i}}$$

where $t_{n - p, (1 + α) / 2}$ is the $100 (1 + α) / 2$ percentile of the Student's $t$ distribution (for example $t_{n - p, 0.975}$, the $97.5$ percentile, when $α = 0.95$) with $n - k$ degrees of freedom, where $k$ is the rank of the cross-product matrix $X^{T} X$.

Four methods for estimating the covariance matrix, $Σ$, are available:

(i) Independent, identically distributed (IID) errors

Under an assumption of IID errors the asymptotic relationship for $Σ$ simplifies to

$$Σ = \frac{τ (1 - τ)}{n} {(s (τ))}^{2} {(X^{T} X)}^{- 1}$$

where $s$ is the sparsity function. g02qg estimates $s (τ)$ from the residuals, $r_{i} = y_{i} - x_{i}^{T} \hat{β}$, and a bandwidth $h_{n}$.

(ii) Powell Sandwich

Powell (1991) suggested estimating the matrix $H_{n}$ by a kernel estimator of the form

$${\hat{H}}_{n} = {(n c_{n})}^{- 1} \sum_{i = 1}^{n} K (\frac{r_{i}}{c_{n}}) x_{i} x_{i}^{T}$$

where $K$ is a kernel function and $c_{n}$ satisfies $\lim_{n \to \infty} c_{n} \to 0$ and $\lim_{n \to \infty} \sqrt{n} c_{n} \to \infty$. When the Powell method is chosen, g02qg uses a Gaussian kernel (i.e., $K = ϕ$) and sets

$$c_{n} = \min (σ_{r}, (q_{r 3} - q_{r 1}) / 1.34) \times (Φ^{- 1} (τ + h_{n}) - Φ^{- 1} (τ - h_{n}))$$

where $h_{n}$ is a bandwidth, and $σ_{r}, q_{r 1}$ and $q_{r 3}$ are, respectively, the standard deviation and the $25 \%$ and $75 \%$ quantiles for the residuals, $r_{i}$.

(iii) Hendricks–Koenker Sandwich

Koenker (2005) suggested estimating the matrix $H_{n}$ using

$${\hat{H}}_{n} = n^{- 1} \sum_{i = 1}^{n} [\frac{2 h_{n}}{x_{i}^{T} (\hat{β} (τ + h_{n}) - \hat{β} (τ - h_{n}))}] x_{i} x_{i}^{T}$$

where $h_{n}$ is a bandwidth and $\hat{β} (τ + h_{n})$ denotes the parameter estimates obtained from a quantile regression using the $(τ + h_{n})$th quantile. Similarly with $\hat{β} (τ - h_{n})$.

(iv) Bootstrap

The last method uses bootstrapping to either estimate a covariance matrix or obtain confidence intervals for the parameter estimates directly. This method therefore does not assume Normally distributed errors. Samples of size $n$ are taken from the paired data $\{y_{i}, x_{i}\}$ (i.e., the independent and dependent variables are sampled together). A quantile regression is then fitted to each sample resulting in a series of bootstrap estimates for the model parameters, $β$. A covariance matrix can then be calculated directly from this series of values. Alternatively, confidence limits, ${\hat{β}}_{L}$ and ${\hat{β}}_{U}$, can be obtained directly from the $(1 - α) / 2$ and $(1 + α) / 2$ sample quantiles of the bootstrap estimates.

Further details of the algorithms used to calculate the covariance matrices can be found in [Algorithmic Details].

All three asymptotic estimates of the covariance matrix require a bandwidth, $h_{n}$. Two alternative methods for determining this are provided:

(i) Sheather–Hall

$$h_{n} = {(\frac{1.5 {(Φ^{- 1} (α_{b}) ϕ (Φ^{- 1} (τ)))}^{2}}{n (2 Φ^{- 1} (τ) + 1)})}^{\frac{1}{3}}$$

for a user-supplied value $α_{b}$.

(ii) Bofinger

$$h_{n} = {(\frac{4.5 {(ϕ (Φ^{- 1} (τ)))}^{4}}{n {(2 Φ^{- 1} (τ) + 1)}^{2}})}^{\frac{1}{5}}$$
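
The two bandwidth rules can be evaluated with the standard Normal functions in Python's `statistics` module. This sketch transcribes the formulas exactly as printed in this section and is not taken from the library's source; the function names and the decision to raise an error when the printed denominator is not positive are assumptions made for the example.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # Phi and phi refer to the standard Normal


def sheather_hall(tau, n, alpha_b):
    """Sheather-Hall bandwidth, transcribed from the printed formula."""
    z = _STD_NORMAL.inv_cdf(tau)
    denom = n * (2.0 * z + 1.0)
    if denom <= 0.0:
        # Assumption: the sketch refuses to guess when the printed
        # denominator is not positive.
        raise ValueError("printed denominator is not positive for this tau")
    numer = 1.5 * (_STD_NORMAL.inv_cdf(alpha_b) * _STD_NORMAL.pdf(z)) ** 2
    return (numer / denom) ** (1.0 / 3.0)


def bofinger(tau, n):
    """Bofinger bandwidth, transcribed from the printed formula."""
    z = _STD_NORMAL.inv_cdf(tau)
    denom = n * (2.0 * z + 1.0) ** 2
    if denom == 0.0:
        raise ValueError("printed denominator is zero for this tau")
    numer = 4.5 * _STD_NORMAL.pdf(z) ** 4
    return (numer / denom) ** 0.2
```

At $τ = 0.5$, where $Φ^{- 1} (τ) = 0$ and $ϕ (0) = 1 / \sqrt{2 π}$, both expressions reduce to closed forms that are easy to check by hand, and both bandwidths shrink as $n$ grows.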

g02qg allows optional arguments to be supplied via the iopts and opts arrays (see [Optional Parameters] for details of the available options). Prior to calling g02qg the optional parameter arrays, iopts and opts, must be initialized by calling (G02ZKF not in this release) with optstr set to $Initialize = g02qg$. If bootstrap confidence limits are required ($Interval Method ='BOOTSTRAP XY'$) then one of the random number initialization methods (G05KFF not in this release) (for a repeatable analysis) or (G05KGF not in this release) (for an unrepeatable analysis) must also have been previously called.

References

Koenker R (2005) Quantile Regression Econometric Society Monographs, Cambridge University Press, New York

Mehrotra S (1992) On the implementation of a primal-dual interior point method SIAM J. Optim. 2 575–601

Nocedal J and Wright S J (1999) Numerical Optimization Springer Series in Operations Research, Springer, New York

Portnoy S and Koenker R (1997) The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute error estimators Statistical Science 4 279–300

Powell J L (1991) Estimation of monotonic regression models under quantile restrictions Nonparametric and Semiparametric Methods in Econometrics Cambridge University Press, Cambridge

Error Indicators and Warnings

Errors or warnings detected by the method:

Some error messages may refer to parameters that are dropped from this interface (LDDAT, RIP, TDCH, SDRES, LIOPTS, LOPTS, LSTATE). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.

$ifail = 11$: On entry, $sorder \neq 1$ or $2$ .

$ifail = 21$: On entry, $ic1 \neq 1$ or $0$ .

$ifail = 31$: On entry, $_weight \neq "U"$ or $"W"$ .

$ifail = 41$: On entry, $n < 2$ .

$ifail = 51$: On entry, $m < 0$ .

$ifail = 71$: On entry, $sorder = 1$ , $lddat < n$ .

$ifail = 72$: On entry, $sorder = 2$ , $lddat < m$ .

$ifail = 81$: On entry, $isx [j - 1] \neq 0$ or $1$ .

$ifail = 91$: On entry, $ip < 1$ or $ip \geq n$ .

$ifail = 92$: On entry, ip is not consistent with isx and ic1.

$ifail = 111$: On entry, $_weight = "W"$ and $wt [i - 1] < 0.0$ for at least one $i$ .

$ifail = 112$: On entry, the effective number of observations is less than two.

$ifail = 121$: On entry, $ntau < 1$ .

$ifail = 131$: On entry, tau is invalid.

$ifail = 201$: On entry, one or more of the optional parameter arrays iopts and opts have not been initialized or have been corrupted.

$ifail = 221$: On entry, $Interval Method ='BOOTSTRAP XY'$ and state was not initialized or has been corrupted.

$ifail = 231$: On exit, problems were encountered whilst fitting at least one model. Additional information has been returned in info.

$ifail = -4000$: Invalid dimension for array $〈value〉$
$ifail = -8000$: Negative dimension for array $〈value〉$
$ifail = -6000$: Invalid Parameters $〈value〉$

Accuracy

Not applicable.

Parallelism and Performance

None.

Further Comments

g02qg allocates internally approximately the following elements of real storage: $13 n + n p + 3 p^{2} + 6 p + 3 (p + 1) \times ntau$. If $Interval Method ='BOOTSTRAP XY'$ then a further $n p$ elements are required, and this increases by $p \times ntau \times Bootstrap Iterations$ if $Bootstrap Interval Method ='QUANTILE'$. Where possible, any user-supplied output arrays are used as workspace and so the amount actually allocated may be less. If $sorder = 2$, wt is NULL, $ic1 = 0$ and $ip = m$, an internal copy of the input data is avoided and the amount of locally allocated memory is reduced by $n p$.
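
The storage estimate above is straightforward to tabulate for a given problem size. The Python sketch below adds up the stated terms; the function and argument names are hypothetical, the result is an approximate upper figure in elements rather than bytes, and the reductions mentioned above (reuse of output arrays, avoided data copy) are not modelled.

```python
def real_workspace_estimate(n, p, ntau, bootstrap_xy=False,
                            quantile_intervals=False,
                            bootstrap_iterations=0):
    """Approximate count of real elements allocated internally."""
    total = 13 * n + n * p + 3 * p * p + 6 * p + 3 * (p + 1) * ntau
    if bootstrap_xy:
        # Interval Method = 'BOOTSTRAP XY' needs a further n*p elements.
        total += n * p
        if quantile_intervals:
            # Bootstrap Interval Method = 'QUANTILE' stores every estimate.
            total += p * ntau * bootstrap_iterations
    return total
```

For instance, with $n = 100$, $p = 2$ and five quantiles the base estimate is 1569 elements, rising to 2769 when quantile bootstrap intervals with 100 iterations are requested.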

Example

A quantile regression model is fitted to Engel's 1857 study of household expenditure on food. The model regresses the dependent variable, household food expenditure, against two explanatory variables, a column of ones and household income. The model is fitted for five different values of $τ$ and the covariance matrix is estimated assuming Normal IID errors. Both the covariance matrix and the residuals are returned.

Example program (C#): g02qge.cs

Example program data: g02qge.d

Example program results: g02qge.r

Algorithmic Details

By the addition of slack variables the minimization (1) can be reformulated into the linear programming problem

$$\underset{(u, v, β) \in ℝ_{+}^{n} \times ℝ_{+}^{n} \times ℝ^{p}}{\text{minimize}} τ e^{T} u + (1 - τ) e^{T} v \quad \text{subject to} \quad y = X β + u - v \qquad (2)$$

and its associated dual

$$\underset{d}{\text{maximize}} \ y^{T} d \quad \text{subject to} \quad X^{T} d = 0, d \in {[τ - 1, τ]}^{n} \qquad (3)$$

where $e$ is a vector of $n$ $1$s. Setting $a = d + (1 - τ) e$ gives the equivalent formulation

$$\underset{a}{\text{maximize}} \ y^{T} a \quad \text{subject to} \quad X^{T} a = (1 - τ) X^{T} e, a \in {[0, 1]}^{n} . \qquad (4)$$

The algorithm introduced by Portnoy and Koenker (1997), and used by g02qg, uses the primal-dual formulation expressed in equations (2) and (4) along with a logarithmic barrier function to obtain estimates for $β$. The algorithm is based on the predictor-corrector algorithm of Mehrotra (1992) and further details can be obtained from Portnoy and Koenker (1997) and Koenker (2005). A good description of linear programming, interior point algorithms, barrier functions and Mehrotra's predictor-corrector algorithm can be found in Nocedal and Wright (1999).
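
The link between (1) and (2) can be checked numerically: for a fixed $β$, taking $u$ and $v$ as the positive and negative parts of the residuals satisfies $y = X β + u - v$ and makes the linear objective equal to the sum of check losses. The Python sketch below illustrates this and the change of variable $a = d + (1 - τ) e$; the function names are assumptions made for the example.

```python
def split_residuals(y, X, beta):
    """Return (u, v): positive and negative parts of y - X beta."""
    u, v = [], []
    for yi, xi in zip(y, X):
        r = yi - sum(xij * bj for xij, bj in zip(xi, beta))
        u.append(max(r, 0.0))
        v.append(max(-r, 0.0))
    return u, v


def lp_objective(u, v, tau):
    """Objective of (2): tau * e^T u + (1 - tau) * e^T v."""
    return tau * sum(u) + (1.0 - tau) * sum(v)


def shift_dual(d, tau):
    """Map d in [tau - 1, tau]^n to a = d + (1 - tau) e in [0, 1]^n."""
    return [di + (1.0 - tau) for di in d]
```

With $y = (1, 2, 4)$, an intercept-only model, $β = 2$ and $τ = 0.25$, the residuals are $-1, 0, 2$, so the objective of (2) is $0.25 \times 2 + 0.75 \times 1 = 1.25$, which matches $ρ_{τ} (-1) + ρ_{τ} (0) + ρ_{τ} (2)$.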

Interior Point Algorithm

In this section a brief description of the interior point algorithm used to estimate the model parameters is presented. It should be noted that there are some differences in the equations given here – particularly (7) and (9) – compared to those given in Koenker (2005) and Portnoy and Koenker (1997).

Central path

Rather than optimize (4) directly, an additional slack variable $s$ is added and the constraint $a \in {[0, 1]}^{n}$ is replaced with $a + s = e, a_{i} \geq 0, s_{i} \geq 0$, for $i = 1, 2, \dots, n$.

The positivity constraint on $a$ and $s$ is handled using the logarithmic barrier function

$$B (a, s, μ) = y^{T} a + μ \sum_{i = 1}^{n} (\log a_{i} + \log s_{i}) .$$

The primal-dual form of the problem is used giving the Lagrangian

$$L (a, s, β, u, μ) = B (a, s, μ) - β^{T} (X^{T} a - (1 - τ) X^{T} e) - u^{T} (a + s - e)$$

whose central path is described by the following first order conditions

$$\begin{matrix} X^{T} a & = & (1 - τ) X^{T} e \\ a + s & = & e \\ X β + u - v & = & y \\ S U e & = & μ e \\ A V e & = & μ e \end{matrix} \qquad (5)$$

where $A$ denotes the diagonal matrix with diagonal elements given by $a$, similarly with $S, U$ and $V$. By enforcing the inequalities on $s$ and $a$ strictly, i.e., $a_{i} > 0$ and $s_{i} > 0$ for all $i$, we ensure that $A$ and $S$ are positive definite diagonal matrices and hence $A^{- 1}$ and $S^{- 1}$ exist.

Rather than applying Newton's method to the system of equations given in (5) to obtain the step directions $δ_{β}, δ_{a}, δ_{s}, δ_{u}$ and $δ_{v}$, Mehrotra substituted the steps directly into (5) giving the augmented system of equations

$$\begin{matrix} X^{T} (a + δ_{a}) & = & (1 - τ) X^{T} e \\ (a + δ_{a}) + (s + δ_{s}) & = & e \\ X (β + δ_{β}) + (u + δ_{u}) - (v + δ_{v}) & = & y \\ (S + Δ_{s}) (U + Δ_{u}) e & = & μ e \\ (A + Δ_{a}) (V + Δ_{v}) e & = & μ e \end{matrix} \qquad (6)$$

where $Δ_{a}, Δ_{s}, Δ_{u}$ and $Δ_{v}$ denote the diagonal matrices with diagonal elements given by $δ_{a}, δ_{s}, δ_{u}$ and $δ_{v}$ respectively.

Affine scaling step

The affine scaling step is constructed by setting $μ = 0$ in (5) and applying Newton's method to obtain an intermediate set of step directions

$$\begin{matrix} (X^{T} W X) δ_{β} & = & X^{T} W (y - X β) + (τ - 1) X^{T} e + X^{T} a \\ δ_{a} & = & W (y - X β - X δ_{β}) \\ δ_{s} & = & - δ_{a} \\ δ_{u} & = & S^{- 1} U δ_{a} - U e \\ δ_{v} & = & A^{- 1} V δ_{s} - V e \end{matrix} \qquad (7)$$

where $W = {(S^{- 1} U + A^{- 1} V)}^{- 1}$.

Initial step sizes for the primal (${\hat{γ}}_{P}$) and dual (${\hat{γ}}_{D}$) parameters are constructed as

$$\begin{matrix} {\hat{γ}}_{P} = σ \min \{\min_{i, δ_{a_{i}} < 0} \{a_{i} / δ_{a_{i}}\}, \min_{i, δ_{s_{i}} < 0} \{s_{i} / δ_{s_{i}}\}\} \\ {\hat{γ}}_{D} = σ \min \{\min_{i, δ_{u_{i}} < 0} \{u_{i} / δ_{u_{i}}\}, \min_{i, δ_{v_{i}} < 0} \{v_{i} / δ_{v_{i}}\}\} \end{matrix} \qquad (8)$$

where $σ$ is a user-supplied scaling factor. If ${\hat{γ}}_{P} \times {\hat{γ}}_{D} \geq 1$ then the nonlinearity adjustment, described in [Nonlinearity Adjustment], is not made and the model parameters are updated using the current step size and directions.

Nonlinearity Adjustment

In the nonlinearity adjustment step a new estimate of $μ$ is obtained by letting

$$\hat{g} ({\hat{γ}}_{P}, {\hat{γ}}_{D}) = {(s + {\hat{γ}}_{P} δ_{s})}^{T} (u + {\hat{γ}}_{D} δ_{u}) + {(a + {\hat{γ}}_{P} δ_{a})}^{T} (v + {\hat{γ}}_{D} δ_{v})$$

and estimating $μ$ as

$$μ = {(\frac{\hat{g} ({\hat{γ}}_{P}, {\hat{γ}}_{D})}{\hat{g} (0, 0)})}^{3} \frac{\hat{g} (0, 0)}{2 n} .$$

This estimate, along with the nonlinear terms ($Δ_{u} Δ_{s}$ and $Δ_{a} Δ_{v}$) from (6), are calculated using the values of $δ_{a}, δ_{s}, δ_{u}$ and $δ_{v}$ obtained from the affine scaling step.

Given an updated estimate for $μ$ and the nonlinear terms, the system of equations

$$\begin{matrix} (X^{T} W X) δ_{β} & = & X^{T} W (y - X β + μ (S^{- 1} - A^{- 1}) e + S^{- 1} Δ_{s} Δ_{u} e - A^{- 1} Δ_{a} Δ_{v} e) + (τ - 1) X^{T} e + X^{T} a \\ δ_{a} & = & W (y - X β - X δ_{β} + μ (S^{- 1} - A^{- 1})) \\ δ_{s} & = & - δ_{a} \\ δ_{u} & = & μ S^{- 1} e + S^{- 1} U δ_{a} - U e - S^{- 1} Δ_{s} Δ_{u} e \\ δ_{v} & = & μ A^{- 1} e + A^{- 1} V δ_{s} - V e - A^{- 1} Δ_{a} Δ_{v} e \end{matrix} \qquad (9)$$

are solved and updated values for $δ_{β}, δ_{a}, δ_{s}, δ_{u}, δ_{v}, {\hat{γ}}_{P}$ and ${\hat{γ}}_{D}$ calculated.

Update and convergence

At each iteration the model parameters $(β, a, s, u, v)$ are updated using step directions, $(δ_{β}, δ_{a}, δ_{s}, δ_{u}, δ_{v})$ and step lengths $({\hat{γ}}_{P}, {\hat{γ}}_{D})$.

Convergence is assessed using the duality gap, that is, the difference between the objective function in the primal and dual formulations. For any feasible point $(u, v, s, a)$ the duality gap can be calculated from equations (2) and (3) as

$$\begin{matrix} τ e^{T} u + (1 - τ) e^{T} v - d^{T} y & = & τ e^{T} u + (1 - τ) e^{T} v - {(a - (1 - τ) e)}^{T} y \\ & = & s^{T} u + a^{T} v \\ & = & e^{T} u - a^{T} y + (1 - τ) e^{T} X β \end{matrix}$$

and the optimization terminates if the duality gap is smaller than the tolerance supplied in the optional parameter Tolerance.
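
The middle form of the duality gap, $s^{T} u + a^{T} v$, is the cheapest to evaluate. The Python sketch below computes it and applies the termination test; it is a minimal illustration assuming plain lists for the vectors, and the function names and the `tolerance` argument (standing in for the optional parameter Tolerance) are hypothetical.

```python
def duality_gap(s, u, a, v):
    """Return s^T u + a^T v for a feasible point (u, v, s, a)."""
    return (sum(si * ui for si, ui in zip(s, u))
            + sum(ai * vi for ai, vi in zip(a, v)))


def has_converged(s, u, a, v, tolerance):
    """True if the duality gap is smaller than the supplied tolerance."""
    return duality_gap(s, u, a, v) < tolerance
```

As a check, with $a = (1 - τ) e$ and $s = τ e$ the dual variable $d = a - (1 - τ) e$ is zero, so the gap reduces to the primal objective $τ e^{T} u + (1 - τ) e^{T} v$.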

Additional information

Initial values are required for the parameters $a, s, u, v$ and $β$. If not supplied by the user, initial values for $β$ are calculated from a least squares regression of $y$ on $X$. This regression is carried out by first constructing the cross-product matrix $X^{T} X$ and then using a pivoted $Q R$ decomposition as performed by f08bf. In addition, if the cross-product matrix is not of full rank, a rank reduction is carried out and, rather than using the full design matrix, $X$, a matrix formed from the first $p$-rank columns of $X P$ is used instead, where $P$ is the pivot matrix used during the $Q R$ decomposition. Parameter estimates, confidence intervals and the rows and columns of the matrices returned in the parameter ch (if any) are set to zero for variables dropped during the rank-reduction. The rank reduction step is performed irrespective of whether initial values are supplied by the user.

Once initial values have been obtained for $β$, the initial values for $u$ and $v$ are calculated from the residuals. If $|r_{i}| < ε_{u}$ then a value of $\pm ε_{u}$ is used instead, where $ε_{u}$ is supplied in the optional parameter Epsilon. The initial values for $a$ and $s$ are always set to $1 - τ$ and $τ$ respectively.

The solution for $δ_{β}$ in both (7) and (9) is obtained using a Bunch–Kaufman decomposition, as implemented in (F07MDF not in this release).

Calculation of Covariance Matrix

g02qg supplies four methods to calculate the covariance matrices associated with the parameter estimates for $β$. This section gives some additional detail on three of the algorithms; the fourth (which uses bootstrapping) is described in [Description].

(i) Independent, identically distributed (IID) errors

When assuming IID errors, the covariance matrices depend on the sparsity, $s (τ)$, which g02qg estimates as follows:

(a)	Let $r_{i}$ denote the residuals from the original quantile regression, that is $r_{i} = y_{i} - x_{i}^{T} \hat{β}$.
(b)	Drop any residual where $|r_{i}|$ is less than $ε_{u}$, supplied in the optional parameter Epsilon.
(c)	Sort and relabel the remaining residuals in ascending order, by absolute value, so that $ε_{u} < |r_{1}| < |r_{2}| < \dots$.
(d)	Select the first $l$ values where $l = h_{n} n$, for some bandwidth $h_{n}$.
(e)	Sort and relabel these $l$ residuals again, so that $r_{1} < r_{2} < \dots < r_{l}$ and regress them against a design matrix with two columns ($p = 2$) and rows given by $x_{i} = \{1, i / (n - p)\}$ using quantile regression with $τ = 0.5$.
(f)	Use the resulting estimate of the slope as an estimate of the sparsity.

(ii) Powell Sandwich

When using the Powell Sandwich to estimate the matrix $H_{n}$, the quantity

$$c_{n} = \min (σ_{r}, (q_{r 3} - q_{r 1}) / 1.34) \times (Φ^{- 1} (τ + h_{n}) - Φ^{- 1} (τ - h_{n}))$$

is calculated. Dependent on the value of $τ$ and the method used to calculate the bandwidth ($h_{n}$), it is possible for the quantities $τ \pm h_{n}$ to be too large or small, compared to machine precision ($ε$). More specifically, when $τ - h_{n} \leq \sqrt{ε}$, or $τ + h_{n} \geq 1 - \sqrt{ε}$, a warning flag is raised in info, the value is truncated to $\sqrt{ε}$ or $1 - \sqrt{ε}$ respectively and the covariance matrix calculated as usual.
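
The truncation rule for $τ \pm h_{n}$ can be sketched as follows. The helper `truncated_quantile_band` is hypothetical, and `sys.float_info.epsilon` is an assumed stand-in for the machine precision $ε$; the returned flag plays the role of the warning raised in info.

```python
import math
import sys


def truncated_quantile_band(tau, h_n, eps=sys.float_info.epsilon):
    """Return (lower, upper, truncated) for the quantiles tau -/+ h_n.

    Values at or beyond sqrt(eps) or 1 - sqrt(eps) are pulled back to
    those limits and the truncated flag is set.
    """
    root = math.sqrt(eps)
    lower, upper = tau - h_n, tau + h_n
    truncated = False
    if lower <= root:
        lower, truncated = root, True
    if upper >= 1.0 - root:
        upper, truncated = 1.0 - root, True
    return lower, upper, truncated
```

For $τ = 0.5$ and $h_{n} = 0.1$ no truncation occurs, whereas for $τ = 0.05$ and $h_{n} = 0.1$ the lower quantile is replaced by $\sqrt{ε}$ and the flag is set.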

(iii) Hendricks–Koenker Sandwich

The Hendricks–Koenker Sandwich requires the calculation of the quantity $d_{i} = x_{i}^{T} (\hat{β} (τ + h_{n}) - \hat{β} (τ - h_{n}))$. As with the Powell Sandwich, in cases where $τ - h_{n} \leq \sqrt{ε}$, or $τ + h_{n} \geq 1 - \sqrt{ε}$, a warning flag is raised in info, the value is truncated to $\sqrt{ε}$ or $1 - \sqrt{ε}$ respectively and the covariance matrix calculated as usual.

In addition, it is required that $d_{i} > 0$ in this method. Hence, instead of using $2 h_{n} / d_{i}$ in the calculation of $H_{n}$, $\max (2 h_{n} / (d_{i} + ε_{u}), 0)$ is used, where $ε_{u}$ is supplied in the optional parameter Epsilon.
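
The safeguarded weights used in place of $2 h_{n} / d_{i}$ can be sketched in Python as below. This is an illustration of the expression only; the name `hk_weights` and the list-based arguments are assumptions, and `beta_hi` and `beta_lo` stand for $\hat{β} (τ + h_{n})$ and $\hat{β} (τ - h_{n})$ obtained from two separate fits.

```python
def hk_weights(X, beta_hi, beta_lo, h_n, eps_u):
    """Return max(2 h_n / (d_i + eps_u), 0) for each row x_i of X.

    Assumes d_i + eps_u is nonzero for every row.
    """
    weights = []
    for x in X:
        # d_i = x_i^T (beta(tau + h_n) - beta(tau - h_n))
        d = sum(xj * (bh - bl) for xj, bh, bl in zip(x, beta_hi, beta_lo))
        weights.append(max(2.0 * h_n / (d + eps_u), 0.0))
    return weights
```

Rows for which the two fitted quantile planes cross, so that $d_{i} + ε_{u}$ is negative, receive a weight of zero rather than a negative contribution to ${\hat{H}}_{n}$.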

Description of Monitoring Information

See the description of the optional argument Monitoring.
