g02qg: FL CL CPP AD

NAG CL Interface
g02qgc (quantile_linreg)

Note: this function uses optional parameters to define choices in the problem specification and in the details of the algorithm. If you wish to use default settings for all of the optional parameters, you need only read Sections 1 to 10 of this document. If, however, you wish to reset some or all of the settings please refer to Section 11 for a detailed description of the algorithm, to Section 12 for a detailed description of the specification of the optional parameters and to Section 13 for a detailed description of the monitoring information produced by the function.

NAG Library Manual, Mark 27.2

1 Purpose

g02qgc performs a multiple linear quantile regression. Parameter estimates and, if required, confidence limits, covariance matrices and residuals are calculated. g02qgc may be used to perform a weighted quantile regression. A simplified interface for g02qgc is provided by g02qfc.

2 Specification

#include <nag.h>

void g02qgc (Nag_OrderType order, Nag_IncludeIntercept intcpt, Integer n, Integer m, const double dat[], Integer pddat, const Integer isx[], Integer ip, const double y[], const double wt[], Integer ntau, const double tau[], double *df, double b[], double bl[], double bu[], double ch[], double res[], const Integer iopts[], const double opts[], Integer state[], Integer info[], NagError *fail)

The function may be called by the names: g02qgc, nag_correg_quantile_linreg or nag_regsn_quant_linear.

3 Description

Given a vector of $n$ observed values, $y = \{y_{i} : i = 1, 2, \dots, n\}$, an $n \times p$ design matrix $X$, a column vector, $x_{i}$, of length $p$ holding the $i$th row of $X$ and a quantile $τ \in (0, 1)$, g02qgc estimates the $p$-element vector $β$ as the solution to

$\underset{β \in ℝ^{p}}{\text{minimize}} \sum_{i = 1}^{n} ρ_{τ} (y_{i} - x_{i}^{T} β)$ (1)

where $ρ_{τ}$ is the piecewise linear loss function $ρ_{τ} (z) = z (τ - I (z < 0))$, and $I (z < 0)$ is an indicator function taking the value $1$ if $z < 0$ and $0$ otherwise. Weights can be incorporated by replacing $X$ and $y$ with $W X$ and $W y$ respectively, where $W$ is an $n \times n$ diagonal matrix. Observations with zero weights can either be included or excluded from the analysis; this is in contrast to least squares regression where such observations do not contribute to the objective function and are, therefore, always dropped.
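The loss function in (1) can be written out directly. The following is a minimal sketch, not part of the library: `rho_tau` and `quantile_objective` are hypothetical helper names, and the objective is restricted to an intercept plus one explanatory variable to keep the example short.

```c
#include <stddef.h>

/* Piecewise linear loss: rho_tau(z) = z * (tau - I(z < 0)). */
double rho_tau(double z, double tau)
{
    return z * (tau - (z < 0.0 ? 1.0 : 0.0));
}

/* Objective (1) for an intercept b0 and a single slope b1
   (hypothetical helper, for illustration only): the sum of
   rho_tau(y[i] - b0 - b1 * x[i]) over the n observations. */
double quantile_objective(size_t n, const double x[], const double y[],
                          double b0, double b1, double tau)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += rho_tau(y[i] - b0 - b1 * x[i], tau);
    return total;
}
```

A positive residual is charged at rate $τ$ and a negative one at rate $1 - τ$, so for $τ = 0.5$ the objective is half the sum of absolute residuals.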

g02qgc uses the interior point algorithm of Portnoy and Koenker (1997), described briefly in Section 11, to obtain the parameter estimates $\hat{β}$, for a given value of $τ$.

Under the assumption of Normally distributed errors, Koenker (2005) shows that the limiting covariance matrix of $\hat{β} - β$ has the form

$Σ = \frac{τ (1 - τ)}{n} {H_{n}}^{- 1} J_{n} {H_{n}}^{- 1}$

where $J_{n} = n^{- 1} \sum_{i = 1}^{n} x_{i} x_{i}^{T}$ and $H_{n}$ is a function of $τ$, as described below. Given an estimate of the covariance matrix, $\hat{Σ}$, lower (${\hat{β}}_{L}$) and upper (${\hat{β}}_{U}$) limits for a $(100 \times α) \%$ confidence interval can be calculated for each of the $p$ parameters, via

${\hat{β}}_{L i} = {\hat{β}}_{i} - t_{n - p, (1 + α) / 2} \sqrt{{\hat{Σ}}_{i i}}, \quad {\hat{β}}_{U i} = {\hat{β}}_{i} + t_{n - p, (1 + α) / 2} \sqrt{{\hat{Σ}}_{i i}}$

where $t_{n - p, (1 + α) / 2}$ is the $100 (1 + α) / 2$ percentile (for example, $t_{n - p, 0.975}$ is the $97.5$ percentile, used when $α = 0.95$) of the Student's $t$ distribution with $n - k$ degrees of freedom, where $k$ is the rank of the cross-product matrix $X^{T} X$.

Four methods for estimating the covariance matrix, $Σ$, are available:

(i) Independent, identically distributed (IID) errors
Under an assumption of IID errors the asymptotic relationship for $Σ$ simplifies to

$Σ = \frac{τ (1 - τ)}{n} {(s (τ))}^{2} {(X^{T} X)}^{- 1}$

where $s$ is the sparsity function. g02qgc estimates $s (τ)$ from the residuals, $r_{i} = y_{i} - x_{i}^{T} \hat{β}$ and a bandwidth $h_{n}$ .
(ii) Powell Sandwich
Powell (1991) suggested estimating the matrix $H_{n}$ by a kernel estimator of the form

${\hat{H}}_{n} = {(n c_{n})}^{- 1} \sum_{i = 1}^{n} K (\frac{r_{i}}{c_{n}}) x_{i} x_{i}^{T}$

where $K$ is a kernel function and $c_{n}$ satisfies $\lim_{n \to \infty} c_{n} \to 0$ and $\lim_{n \to \infty} \sqrt{n} c_{n} \to \infty$ . When the Powell method is chosen, g02qgc uses a Gaussian kernel (i.e., $K = ϕ$ ) and sets

$c_{n} = \min (σ_{r}, (q_{r 3} - q_{r 1}) / 1.34) \times (Φ^{- 1} (τ + h_{n}) - Φ^{- 1} (τ - h_{n}))$

where $h_{n}$ is a bandwidth, $σ_{r}, q_{r 1}$ and $q_{r 3}$ are, respectively, the standard deviation and the $25 %$ and $75 %$ quantiles for the residuals, $r_{i}$ .
(iii) Hendricks–Koenker Sandwich
Koenker (2005) suggested estimating the matrix $H_{n}$ using

${\hat{H}}_{n} = n^{- 1} \sum_{i = 1}^{n} [\frac{2 h_{n}}{x_{i}^{T} (\hat{β} (τ + h_{n}) - \hat{β} (τ - h_{n}))}] x_{i} x_{i}^{T}$

where $h_{n}$ is a bandwidth and $\hat{β} (τ + h_{n})$ denotes the parameter estimates obtained from a quantile regression using the $(τ + h_{n})$ th quantile. Similarly with $\hat{β} (τ - h_{n})$ .
(iv) Bootstrap
The last method uses bootstrapping to either estimate a covariance matrix or obtain confidence intervals for the parameter estimates directly. This method, therefore, does not assume Normally distributed errors. Samples of size $n$ are taken from the paired data $\{y_{i}, x_{i}\}$ (i.e., the independent and dependent variables are sampled together). A quantile regression is then fitted to each sample resulting in a series of bootstrap estimates for the model parameters, $β$ . A covariance matrix can then be calculated directly from this series of values. Alternatively, confidence limits, ${\hat{β}}_{L}$ and ${\hat{β}}_{U}$ , can be obtained directly from the $(1 - α) / 2$ and $(1 + α) / 2$ sample quantiles of the bootstrap estimates.

Further details of the algorithms used to calculate the covariance matrices can be found in Section 11.
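As an illustration of the last step of the bootstrap method, the sketch below takes the bootstrap estimates of a single parameter and returns the $(1 - α) / 2$ and $(1 + α) / 2$ sample quantiles. It is not the library's implementation: the function names are hypothetical, and the sample quantile is defined here by linear interpolation between order statistics, which is only one of several common conventions.

```c
#include <stdlib.h>

/* Comparison function for qsort on doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sample quantile by linear interpolation between order statistics
   (an assumed definition). sorted[] must be ascending, m >= 1 and
   0 <= q <= 1. */
static double sample_quantile(const double sorted[], size_t m, double q)
{
    double pos = q * (double)(m - 1);
    size_t lo = (size_t)pos;
    size_t hi = (lo + 1 < m) ? lo + 1 : lo;
    return sorted[lo] + (pos - (double)lo) * (sorted[hi] - sorted[lo]);
}

/* Percentile limits for one parameter from m bootstrap estimates.
   est[] is sorted in place. */
void bootstrap_limits(double est[], size_t m, double alpha,
                      double *lower, double *upper)
{
    qsort(est, m, sizeof est[0], cmp_double);
    *lower = sample_quantile(est, m, (1.0 - alpha) / 2.0);
    *upper = sample_quantile(est, m, (1.0 + alpha) / 2.0);
}
```

With $α = 0.95$ the two calls pick out the 2.5% and 97.5% points of the bootstrap distribution.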

All three asymptotic estimates of the covariance matrix require a bandwidth, $h_{n}$. Two alternative methods for determining this are provided:

(i) Sheather–Hall

$h_{n} = {(\frac{1.5 {(Φ^{- 1} (α_{b}) ϕ (Φ^{- 1} (τ)))}^{2}}{n (2 Φ^{- 1} (τ) + 1)})}^{\frac{1}{3}}$

for a user-supplied value $α_{b}$ ,
(ii) Bofinger

$h_{n} = {(\frac{4.5 {(ϕ (Φ^{- 1} (τ)))}^{4}}{n {(2 Φ^{- 1} (τ) + 1)}^{2}})}^{\frac{1}{5}}$

g02qgc allows optional parameters to be supplied via the iopts and opts arrays (see Section 12 for details of the available options). If the default values for these optional parameters are sufficient then iopts and opts can be set to NULL; otherwise, prior to calling g02qgc, the optional parameter arrays must be initialized by calling g02zkc with optstr set to $Initialize = g02qgc$. If bootstrap confidence limits are required ($Interval Method = BOOTSTRAP XY$) then one of the random number initialization functions g05kfc (for a repeatable analysis) or g05kgc (for an unrepeatable analysis) must also have been previously called.

4 References

Koenker R (2005) Quantile Regression Econometric Society Monographs, Cambridge University Press, New York

Mehrotra S (1992) On the implementation of a primal-dual interior point method SIAM J. Optim. 2 575–601

Nocedal J and Wright S J (2006) Numerical Optimization (2nd Edition) Springer Series in Operations Research, Springer, New York

Portnoy S and Koenker R (1997) The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute error estimators Statistical Science 4 279–300

Powell J L (1991) Estimation of monotonic regression models under quantile restrictions Nonparametric and Semiparametric Methods in Econometrics Cambridge University Press, Cambridge

5 Arguments

1: $order$ – Nag_OrderType Input

On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by $order = Nag_RowMajor$. See Section 3.1.3 in the Introduction to the NAG Library CL Interface for a more detailed explanation of the use of this argument.

Constraint: $order = Nag_RowMajor$ or $Nag_ColMajor$.

2: $intcpt$ – Nag_IncludeIntercept Input

On entry: indicates whether an intercept will be included in the model. The intercept is included by adding a column of ones as the first column in the design matrix, $X$.

$intcpt = Nag_Intercept$: An intercept will be included in the model.
$intcpt = Nag_NoIntercept$: An intercept will not be included in the model.

Constraint: $intcpt = Nag_NoIntercept$ or $Nag_Intercept$.

3: $n$ – Integer Input

On entry: the total number of observations in the dataset. If no weights are supplied, or no zero weights are supplied, or observations with zero weights are included in the model, then the effective number of observations, $n$, equals the argument n. Otherwise the argument n equals $n$ plus the number of observations with zero weights.

Constraint: $n \geq 2$.

4: $m$ – Integer Input

On entry: $m$, the total number of variates in the dataset.

Constraint: $m \geq 0$.

5: $dat [\dim]$ – const double Input

Note: where $DAT (i, j)$ appears in this document, it refers to the array element

$dat [(j - 1) \times pddat + i - 1]$ when $order = Nag_ColMajor$ ;
$dat [(i - 1) \times pddat + j - 1]$ when $order = Nag_RowMajor$ .

On entry: the $i$th value for the $j$th variate, for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, m$, must be supplied in $DAT (i, j)$.

The design matrix $X$ is constructed from dat, isx and intcpt.
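The two storage schemes for dat can be captured in a small indexing helper. This is a sketch only: `order_t` and `dat_index` are hypothetical names standing in for the library's Nag_OrderType values, so that the fragment compiles without nag.h.

```c
#include <stddef.h>

/* Stand-in for Nag_RowMajor / Nag_ColMajor (hypothetical enum). */
typedef enum { ROW_MAJOR, COL_MAJOR } order_t;

/* Zero-based offset of DAT(i, j) in dat, with i and j one-based
   as in the text and pddat the stride between rows or columns. */
size_t dat_index(order_t order, size_t pddat, size_t i, size_t j)
{
    return (order == COL_MAJOR) ? (j - 1) * pddat + (i - 1)
                                : (i - 1) * pddat + (j - 1);
}
```

For example, with row-major storage and pddat equal to m, observation i occupies m consecutive elements starting at offset (i - 1) * m.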

6: $pddat$ – Integer Input

On entry: the stride separating row or column elements (depending on the value of order) in the array dat.

Constraints:

if $order = Nag_ColMajor$ , $pddat \geq n$ ;
otherwise $pddat \geq m$ .

7: $isx [m]$ – const Integer Input

On entry: indicates which independent variables are to be included in the model.

$isx [j - 1] = 0$: The $j$ th variate, supplied in dat, is not included in the regression model.
$isx [j - 1] = 1$: The $j$ th variate, supplied in dat, is included in the regression model.

Constraints:

$isx [j - 1] = 0$ or $1$ , for $j = 1, 2, \dots, m$ ;
if $intcpt = Nag_Intercept$ , exactly $ip - 1$ values of isx must be set to $1$ ;
if $intcpt = Nag_NoIntercept$ , exactly ip values of isx must be set to $1$ .

8: $ip$ – Integer Input

On entry: $p$, the number of independent variables in the model, including the intercept (see intcpt), if present.

Constraints:

$1 \leq ip < n$ ;
if $intcpt = Nag_Intercept$ , $1 \leq ip \leq m + 1$ ;
if $intcpt = Nag_NoIntercept$ , $1 \leq ip \leq m$ .
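The constraints linking ip, isx and intcpt amount to a simple count, which can be checked before the call. The helper below is a hypothetical sketch that uses plain int in place of the library's Integer type and a 0/1 flag in place of Nag_IncludeIntercept.

```c
/* Number of model parameters implied by isx and the intercept choice:
   the count of ones in isx, plus one if an intercept is included.
   Returns -1 if isx holds a value other than 0 or 1. */
int expected_ip(int m, const int isx[], int include_intercept)
{
    int count = include_intercept ? 1 : 0;
    for (int j = 0; j < m; ++j) {
        if (isx[j] != 0 && isx[j] != 1)
            return -1;
        count += isx[j];
    }
    return count;
}
```

Passing a value of ip that differs from this count corresponds to the NE_IP_INCOMP_SX error listed in Section 6.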

9: $y [n]$ – const double Input

On entry: $y$, the observations on the dependent variable.

10: $wt [\dim]$ – const double Input

Note: the dimension, dim, of the array wt must be at least

$n$ , when $wt is not NULL$ ;
otherwise $wt$ is not referenced and may be NULL.

On entry: optionally, the diagonal elements of the weight matrix $W$.

If weights are not provided then wt must be set to NULL.

When

$Drop Zero Weights = YES$: If $wt [i - 1] = 0.0$ , the $i$ th observation is not included in the model, in which case the effective number of observations, $n$ , is the number of observations with nonzero weights. If $Return Residuals = YES$ , the values of res will be set to zero for observations with zero weights.
$Drop Zero Weights = NO$: All observations are included in the model and the effective number of observations, $n$ , equals the argument n.

Constraints:

the effective number of observations $\geq 2$ ;
$wt [i] \geq 0.0$ , for all $i$ .

11: $ntau$ – Integer Input

On entry: the number of quantiles of interest.

Constraint: $ntau \geq 1$.

12: $tau [ntau]$ – const double Input

On entry: the vector of quantiles of interest. A separate model is fitted to each quantile.

Constraint: $\sqrt{ε} < tau [j - 1] < 1 - \sqrt{ε}$ where $ε$ is the machine precision returned by X02AJC, for $j = 1, 2, \dots, ntau$.
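The constraint on tau can be screened before calling the function. In the sketch below, `tau_in_range` is a hypothetical helper and DBL_EPSILON from float.h is used as a stand-in for the machine precision returned by X02AJC; the two values may differ, so treat the result as approximate near the boundaries.

```c
#include <float.h>

/* Returns 1 if sqrt(eps) < tau < 1 - sqrt(eps), else 0.
   The comparison is done on squares to avoid calling sqrt:
   for t > 0, t > sqrt(eps) is equivalent to t * t > eps.
   DBL_EPSILON is an assumed stand-in for the X02AJC value. */
int tau_in_range(double tau)
{
    double lo = tau;        /* distance from 0 */
    double hi = 1.0 - tau;  /* distance from 1 */
    return lo > 0.0 && hi > 0.0
        && lo * lo > DBL_EPSILON && hi * hi > DBL_EPSILON;
}
```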

13: $df$ – double * Output

On exit: the degrees of freedom given by $n - k$, where $n$ is the effective number of observations and $k$ is the rank of the cross-product matrix $X^{T} X$.

14: $b [ip \times ntau]$ – double Input/Output

Note: where $B (i, l)$ appears in this document, it refers to the array element $b [(l - 1) \times ip + i - 1]$.

On entry: if $Calculate Initial Values = NO$, $B (i, l)$ must hold an initial estimate for ${\hat{β}}_{i}$, for $i = 1, 2, \dots, ip$ and $l = 1, 2, \dots, ntau$. If $Calculate Initial Values = YES$, b need not be set.

On exit: $B (i, l)$, for $i = 1, 2, \dots, ip$, contains the estimates of the parameters of the regression model, $\hat{β}$, estimated for $τ = tau [l - 1]$.

If $intcpt = Nag_Intercept$, $B (1, l)$ will contain the estimate corresponding to the intercept and $B (i + 1, l)$ will contain the coefficient of the $j$th variate contained in dat, where $isx [j - 1]$ is the $i$th nonzero value in the array isx.

If $intcpt = Nag_NoIntercept$, $B (i, l)$ will contain the coefficient of the $j$th variate contained in dat, where $isx [j - 1]$ is the $i$th nonzero value in the array isx.

15: $bl [\dim]$ – double Output

Note: the dimension, dim, of the array bl must be at least $ip \times ntau$ when $Interval Method \neq NONE$.

Where $BL (i, l)$ appears in this document, it refers to the array element $bl [(l - 1) \times ip + i - 1]$.

On exit: if $Interval Method \neq NONE$, $BL (i, l)$ contains the lower limit of a $(100 \times α) \%$ confidence interval for $B (i, l)$, for $i = 1, 2, \dots, ip$ and $l = 1, 2, \dots, ntau$.

If $Interval Method = NONE$, bl is not referenced and can be set to NULL.

The method used for calculating the interval is controlled by the optional parameters $Interval Method$ and $Bootstrap Interval Method$. The size of the interval, $α$, is controlled by the optional parameter $Significance Level$.

16: $bu [\dim]$ – double Output

Note: the dimension, dim, of the array bu must be at least $ip \times ntau$ when $Interval Method \neq NONE$.

Where $BU (i, l)$ appears in this document, it refers to the array element $bu [(l - 1) \times ip + i - 1]$.

On exit: if $Interval Method \neq NONE$, $BU (i, l)$ contains the upper limit of a $(100 \times α) \%$ confidence interval for $B (i, l)$, for $i = 1, 2, \dots, ip$ and $l = 1, 2, \dots, ntau$.

If $Interval Method = NONE$, bu is not referenced and can be set to NULL.

The method used for calculating the interval is controlled by the optional parameters $Interval Method$ and $Bootstrap Interval Method$. The size of the interval, $α$, is controlled by the optional parameter $Significance Level$.

17: $ch [\dim]$ – double Output

Note: the dimension, dim, of the array ch must be at least

if $Interval Method \neq NONE$ and $Matrix Returned = COVARIANCE$ , $ip \times ip \times ntau$ ;
if $Interval Method \neq NONE$ , $IID$ or $BOOTSTRAP XY$ and $Matrix Returned = H INVERSE$ , $ip \times ip \times (ntau + 1)$ .

Where $CH (i, j, l)$ appears in this document, it refers to the array element $ch [(l - 1) \times ip \times ip + (j - 1) \times ip + i - 1]$.

On exit: depending on the supplied optional parameters, ch will either not be referenced, hold an estimate of the upper triangular part of the covariance matrix, $Σ$, or an estimate of the upper triangular parts of $n J_{n}$ and $n^{- 1} H_{n}^{- 1}$.

If $Interval Method = NONE$ or $Matrix Returned = NONE$, ch is not referenced.

If $Interval Method = BOOTSTRAP XY$ or $IID$ and $Matrix Returned = H INVERSE$, ch is not referenced.

Otherwise, for $i, j = 1, 2, \dots, ip$, $j \geq i$ and $l = 1, 2, \dots, ntau$:

If $Matrix Returned = COVARIANCE$ , $CH (i, j, l)$ holds an estimate of the covariance between $B (i, l)$ and $B (j, l)$ .
If $Matrix Returned = H INVERSE$ , $CH (i, j, 1)$ holds an estimate of the $(i, j)$ th element of $n J_{n}$ and $CH (i, j, l + 1)$ holds an estimate of the $(i, j)$ th element of $n^{- 1} H_{n}^{- 1}$ , for $τ = tau [l - 1]$ .

The method used for calculating $Σ$ and $H_{n}^{- 1}$ is controlled by the optional parameter $Interval Method$.

In cases where ch is not going to be referenced it can be set to NULL.
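The layout of ch as a sequence of $ip \times ip$ blocks can be illustrated with two small helpers. Both names are hypothetical, and size_t is used in place of the library's Integer type.

```c
#include <stddef.h>

/* Zero-based offset of CH(i, j, l) in ch, with i, j and l one-based. */
size_t ch_index(size_t ip, size_t i, size_t j, size_t l)
{
    return (l - 1) * ip * ip + (j - 1) * ip + (i - 1);
}

/* Minimum length of ch for the two documented layouts:
   ntau blocks when Matrix Returned = COVARIANCE, and ntau + 1
   blocks when Matrix Returned = H INVERSE (h_inverse nonzero). */
size_t ch_min_length(size_t ip, size_t ntau, int h_inverse)
{
    return ip * ip * (ntau + (h_inverse ? 1 : 0));
}
```

In the H INVERSE layout the first block holds $n J_{n}$, so the block for $tau [l - 1]$ is addressed as `ch_index(ip, i, j, l + 1)`.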

18: $res [n \times ntau]$ – double Output

Note: the $(i, j)$th element of the matrix is stored in $res [(j - 1) \times n + i - 1]$.

On exit: if $Return Residuals = YES$, $res [(l - 1) \times n + i - 1]$ holds the (weighted) residuals, $r_{i}$, for $τ = tau [l - 1]$, for $i = 1, 2, \dots, n$ and $l = 1, 2, \dots, ntau$.

If wt is not NULL and $Drop Zero Weights = YES$, the value of res will be set to zero for observations with zero weights.

If $Return Residuals = NO$, res is not referenced and can be set to NULL.

19: $iopts [\dim]$ – const Integer Communication Array

Note: the dimension, dim, of this array is dictated by the requirements of associated functions that must have been previously called. This array MUST be the same array passed as argument iopts in the previous call to g02zkc.

On entry: if the default values of the optional parameters are sufficient, iopts can be set to NULL; otherwise the optional parameter array, as initialized by a call to g02zkc, must be supplied.

20: $opts [\dim]$ – const double Communication Array

Note: the dimension, dim, of this array is dictated by the requirements of associated functions that must have been previously called. This array MUST be the same array passed as argument opts in the previous call to g02zkc.

On entry: if the default values of the optional parameters are sufficient, opts can be set to NULL; otherwise the optional parameter array, as initialized by a call to g02zkc, must be supplied.

21: $state [\dim]$ – Integer Communication Array

Note: the dimension, dim, of this array is dictated by the requirements of associated functions that must have been previously called. This array MUST be the same array passed as argument state in the previous call to nag_rand_init_repeatable (g05kfc) or nag_rand_init_nonrepeatable (g05kgc).

If $Interval Method = BOOTSTRAP XY$, state contains information about the selected random number generator. Otherwise state is not referenced and can be set to NULL.

22: $info [ntau]$ – Integer Output

On exit: $info [i]$ holds additional information concerning the model fitting and confidence limit calculations when $τ = tau [i]$.

Code	Warning
$0$	Model fitted and confidence limits (if requested) calculated successfully
$1$	The function did not converge. The returned values are based on the estimate at the last iteration. Try increasing $Iteration Limit$ whilst calculating the parameter estimates or relaxing the definition of convergence by increasing $Tolerance$ .
$2$	A singular matrix was encountered during the optimization. The model was not fitted for this value of $τ$ .
$4$	Some truncation occurred whilst calculating the confidence limits for this value of $τ$ . See Section 11 for details. The returned upper and lower limits may be narrower than specified.
$8$	The function did not converge whilst calculating the confidence limits. The returned limits are based on the estimate at the last iteration. Try increasing $Iteration Limit$ .
$16$	Confidence limits for this value of $τ$ could not be calculated. The returned upper and lower limits are set to a large positive and large negative value respectively as defined by the optional parameter $Big$ .

It is possible for multiple warnings to be applicable to a single model. In these cases the value returned in info is the sum of the corresponding individual nonzero warning codes.
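Because the individual codes are distinct powers of two, a summed value can be split back into its warnings with bitwise tests. The macro and function names below are hypothetical, and plain int is used in place of the library's Integer type.

```c
/* Warning codes documented for info[]; the macro names are hypothetical. */
#define INFO_NOT_CONVERGED      1
#define INFO_SINGULAR           2
#define INFO_TRUNCATED          4
#define INFO_CI_NOT_CONVERGED   8
#define INFO_CI_NOT_AVAILABLE  16

/* Returns 1 if the given warning contributes to the summed code. */
int info_has(int info, int flag)
{
    return (info & flag) != 0;
}

/* Number of distinct warnings recorded in a summed code. */
int info_count(int info)
{
    int count = 0;
    for (int flag = 1; flag <= 16; flag <<= 1)
        count += info_has(info, flag);
    return count;
}
```

For example, a returned value of 5 means that the fit did not converge (1) and that some truncation occurred whilst calculating the confidence limits (4).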

23: $fail$ – NagError * Input/Output

The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).

6 Error Indicators and Warnings

NE_ALLOC_FAIL: Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
NE_ARRAY_SIZE: On entry, $pddat = ⟨ value ⟩$ and $m = ⟨ value ⟩$ .
Constraint: $pddat \geq m$ .

On entry, $pddat = ⟨ value ⟩$ and $n = ⟨ value ⟩$ .
Constraint: $pddat \geq n$ .
NE_BAD_PARAM: On entry, argument $⟨ value ⟩$ had an illegal value.
NE_INITIALIZATION: On entry, either the option arrays have not been initialized or they have been corrupted.
NE_INT: On entry, $m = ⟨ value ⟩$ .
Constraint: $m \geq 0$ .

On entry, $n = ⟨ value ⟩$ .
Constraint: $n \geq 2$ .

On entry, $ntau = ⟨ value ⟩$ .
Constraint: $ntau \geq 1$ .
NE_INT_2: On entry, $ip = ⟨ value ⟩$ and $n = ⟨ value ⟩$ .
Constraint: $1 \leq ip < n$ .
NE_INT_ARRAY: On entry, $isx [⟨ value ⟩] = ⟨ value ⟩$ .
Constraint: $isx [i] = 0$ or $1$ , for all $i$ .
NE_INTERNAL_ERROR: An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
NE_INVALID_STATE: On entry, state vector has been corrupted or not initialized.
NE_IP_INCOMP_SX: On entry, ip is not consistent with isx or intcpt: $ip = ⟨ value ⟩$ , $expected value = ⟨ value ⟩$ .
NE_NEG_WEIGHT: On entry, $wt [⟨ value ⟩] = ⟨ value ⟩$ .
Constraint: $wt [i] \geq 0.0$ , for all $i$ .
NE_NO_LICENCE: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
NE_OBSERVATIONS: On entry, $effective number of observations = ⟨ value ⟩$ .
Constraint: $effective number of observations \geq ⟨ value ⟩$ .
NE_REAL_ARRAY: On entry, $tau [⟨ value ⟩] = ⟨ value ⟩$ .
Constraint: $\sqrt{ε} < tau [l - 1] < 1 - \sqrt{ε}$ where $ε$ is the machine precision returned by X02AJC, for $l = 1, 2, \dots, ntau$ .
NW_POTENTIAL_PROBLEM: A potential problem occurred whilst fitting the model(s).
Additional information has been returned in info.

7 Accuracy

Not applicable.

8 Parallelism and Performance

g02qgc is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.

g02qgc makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

g02qgc allocates internally approximately the following elements of double storage: $13 n + n p + 3 p^{2} + 6 p + 3 (p + 1) \times ntau$. If $Interval Method = BOOTSTRAP XY$ then a further $n p$ elements are required, and this increases by $p \times ntau \times Bootstrap Iterations$ if $Bootstrap Interval Method = QUANTILE$. Where possible, any user-supplied output arrays are used as workspace and so the amount actually allocated may be less. If $order = Nag_RowMajor$, wt is NULL, $intcpt = Nag_NoIntercept$ and $ip = m$, an internal copy of the input data is avoided and the amount of locally allocated memory is reduced by $n p$.
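The storage estimate above is easy to evaluate for a planned problem size. The following sketch simply transcribes the expression; `workspace_doubles` is a hypothetical name, the flags are 0/1 stand-ins for the optional parameter settings, and the result is an upper guide only, since output arrays may be reused as workspace.

```c
/* Approximate count of double elements allocated internally, following
   the expression in the text. bootstrap and quantile are 0/1 flags for
   Interval Method = BOOTSTRAP XY and Bootstrap Interval Method = QUANTILE;
   nboot stands for the Bootstrap Iterations option. */
double workspace_doubles(double n, double p, double ntau,
                         int bootstrap, int quantile, double nboot)
{
    double total = 13.0 * n + n * p + 3.0 * p * p + 6.0 * p
                 + 3.0 * (p + 1.0) * ntau;
    if (bootstrap) {
        total += n * p;                  /* resampled design matrix */
        if (quantile)
            total += p * ntau * nboot;   /* stored bootstrap estimates */
    }
    return total;
}
```

Multiplying the result by sizeof(double) gives an approximate figure in bytes.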

10 Example

A quantile regression model is fitted to Engel's 1857 study of household expenditure on food. The model regresses the dependent variable, household food expenditure, against two explanatory variables, a column of ones and household income. The model is fitted for five different values of $τ$ and the covariance matrix is estimated assuming Normal IID errors. Both the covariance matrix and the residuals are returned.

11 Algorithmic Details

By the addition of slack variables the minimization (1) can be reformulated into the linear programming problem

$\underset{(u, v, β) \in ℝ_{+}^{n} \times ℝ_{+}^{n} \times ℝ^{p}}{\text{minimize}} τ e^{T} u + (1 - τ) e^{T} v \quad \text{subject to} \quad y = X β + u - v$ (2)

and its associated dual

$\underset{d}{\text{maximize}} y^{T} d \quad \text{subject to} \quad X^{T} d = 0, d \in {[τ - 1, τ]}^{n}$ (3)

where $e$ is a vector of $n$ $1$s. Setting $a = d + (1 - τ) e$ gives the equivalent formulation

$\underset{a}{\text{maximize}} y^{T} a \quad \text{subject to} \quad X^{T} a = (1 - τ) X^{T} e, a \in {[0, 1]}^{n} .$ (4)

The algorithm introduced by Portnoy and Koenker (1997), and used by g02qgc, uses the primal-dual formulation expressed in equations (2) and (4) along with a logarithmic barrier function to obtain estimates for $β$. The algorithm is based on the predictor-corrector algorithm of Mehrotra (1992) and further details can be obtained from Portnoy and Koenker (1997) and Koenker (2005). A good description of linear programming, interior point algorithms, barrier functions and Mehrotra's predictor-corrector algorithm can be found in Nocedal and Wright (2006).
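The link between (1) and (2) can be checked numerically for a fixed $β$. For residuals $r = y - X β$, the cheapest feasible slack variables are the positive and negative parts of $r$, and the objective of (2) then equals the sum of the loss function over the residuals. The sketch below uses hypothetical function names and takes the residuals as given.

```c
#include <stddef.h>

/* Objective of (2) for residuals r = y - X beta, using the split
   u_i = max(r_i, 0), v_i = max(-r_i, 0), so that r = u - v with
   u >= 0 and v >= 0. */
double lp_objective(size_t n, const double r[], double tau)
{
    double sum_u = 0.0, sum_v = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (r[i] >= 0.0)
            sum_u += r[i];
        else
            sum_v += -r[i];
    }
    return tau * sum_u + (1.0 - tau) * sum_v;
}

/* Objective of (1) for the same residuals. */
double check_loss_sum(size_t n, const double r[], double tau)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += r[i] * (tau - (r[i] < 0.0 ? 1.0 : 0.0));
    return total;
}
```

The two functions agree for any residual vector, up to rounding, which is why minimizing (2) over $(u, v, β)$ gives the same $β$ as minimizing (1).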

11.1 Interior Point Algorithm

In this section a brief description of the interior point algorithm used to estimate the model parameters is presented. It should be noted that there are some differences in the equations given here – particularly (7) and (9) – compared to those given in Koenker (2005) and Portnoy and Koenker (1997).

11.1.1 Central path

Rather than optimize (4) directly, an additional slack variable $s$ is added and the constraint $a \in {[0, 1]}^{n}$ is replaced with $a + s = e, a_{i} \geq 0, s_{i} \geq 0$, for $i = 1, 2, \dots, n$.

The positivity constraint on $a$ and $s$ is handled using the logarithmic barrier function

$B (a, s, μ) = y^{T} a + μ \sum_{i = 1}^{n} (\log a_{i} + \log s_{i}) .$

The primal-dual form of the problem is used giving the Lagrangian

$L (a, s, β, u, μ) = B (a, s, μ) - β^{T} (X^{T} a - (1 - τ) X^{T} e) - u^{T} (a + s - e)$

whose central path is described by the following first order conditions

$\begin{matrix} X^{T} a & = & (1 - τ) X^{T} e \\ a + s & = & e \\ X β + u - v & = & y \\ S U e & = & μ e \\ A V e & = & μ e \end{matrix}$ (5)

where $A$ denotes the diagonal matrix with diagonal elements given by $a$, similarly with $S, U$ and $V$. By enforcing the inequalities on $s$ and $a$ strictly, i.e., $a_{i} > 0$ and $s_{i} > 0$ for all $i$, we ensure that $A$ and $S$ are positive definite diagonal matrices and hence $A^{- 1}$ and $S^{- 1}$ exist.

Rather than applying Newton's method to the system of equations given in (5) to obtain the step directions $δ_{β}, δ_{a}, δ_{s}, δ_{u}$ and $δ_{v}$, Mehrotra substituted the steps directly into (5) giving the augmented system of equations

$\begin{matrix} X^{T} (a + δ_{a}) & = & (1 - τ) X^{T} e \\ (a + δ_{a}) + (s + δ_{s}) & = & e \\ X (β + δ_{β}) + (u + δ_{u}) - (v + δ_{v}) & = & y \\ (S + Δ_{s}) (U + Δ_{u}) e & = & μ e \\ (A + Δ_{a}) (V + Δ_{v}) e & = & μ e \end{matrix}$ (6)

where $Δ_{a}, Δ_{s}, Δ_{u}$ and $Δ_{v}$ denote the diagonal matrices with diagonal elements given by $δ_{a}, δ_{s}, δ_{u}$ and $δ_{v}$ respectively.

11.1.2 Affine scaling step

The affine scaling step is constructed by setting $μ = 0$ in (5) and applying Newton's method to obtain an intermediate set of step directions

$\begin{matrix} (X^{T} W X) δ_{β} & = & X^{T} W (y - X β) + (τ - 1) X^{T} e + X^{T} a \\ δ_{a} & = & W (y - X β - X δ_{β}) \\ δ_{s} & = & - δ_{a} \\ δ_{u} & = & S^{- 1} U δ_{a} - U e \\ δ_{v} & = & A^{- 1} V δ_{s} - V e \end{matrix}$ (7)

where $W = {(S^{- 1} U + A^{- 1} V)}^{- 1}$.

Initial step sizes for the primal (${\hat{γ}}_{P}$) and dual (${\hat{γ}}_{D}$) parameters are constructed as

$\begin{matrix} {\hat{γ}}_{P} = σ \min \{ \min_{i, δ_{a_{i}} < 0} \{ a_{i} / δ_{a_{i}} \}, \min_{i, δ_{s_{i}} < 0} \{ s_{i} / δ_{s_{i}} \} \} \\ {\hat{γ}}_{D} = σ \min \{ \min_{i, δ_{u_{i}} < 0} \{ u_{i} / δ_{u_{i}} \}, \min_{i, δ_{v_{i}} < 0} \{ v_{i} / δ_{v_{i}} \} \} \end{matrix}$ (8)

where $σ$ is a user-supplied scaling factor. If ${\hat{γ}}_{P} \times {\hat{γ}}_{D} \geq 1$ then the nonlinearity adjustment, described in Section 11.1.3, is not made and the model parameters are updated using the current step size and directions.

11.1.3 Nonlinearity Adjustment

In the nonlinearity adjustment step a new estimate of $μ$ is obtained by letting

$\hat{g} ({\hat{γ}}_{P}, {\hat{γ}}_{D}) = {(s + {\hat{γ}}_{P} δ_{s})}^{T} (u + {\hat{γ}}_{D} δ_{u}) + {(a + {\hat{γ}}_{P} δ_{a})}^{T} (v + {\hat{γ}}_{D} δ_{v})$

and estimating $μ$ as

$μ = {(\frac{\hat{g} ({\hat{γ}}_{P}, {\hat{γ}}_{D})}{\hat{g} (0, 0)})}^{3} \frac{\hat{g} (0, 0)}{2 n} .$

This estimate and the nonlinear terms ($Δ_{u}$, $Δ_{s}$, $Δ_{a}$ and $Δ_{v}$) from (6) are calculated using the values of $δ_{a}, δ_{s}, δ_{u}$ and $δ_{v}$ obtained from the affine scaling step.

Given an updated estimate for $μ$ and the nonlinear terms, the system of equations

$\begin{matrix} (X^{T} W X) δ_{β} & = & X^{T} W (y - X β + μ (S^{- 1} - A^{- 1}) e + S^{- 1} Δ_{s} Δ_{u} e - A^{- 1} Δ_{a} Δ_{v} e) + (τ - 1) X^{T} e + X^{T} a \\ δ_{a} & = & W (y - X β - X δ_{β} + μ (S^{- 1} - A^{- 1})) \\ δ_{s} & = & - δ_{a} \\ δ_{u} & = & μ S^{- 1} e + S^{- 1} U δ_{a} - U e - S^{- 1} Δ_{s} Δ_{u} e \\ δ_{v} & = & μ A^{- 1} e + A^{- 1} V δ_{s} - V e - A^{- 1} Δ_{a} Δ_{v} e \end{matrix}$ (9)

is solved and updated values for $δ_{β}, δ_{a}, δ_{s}, δ_{u}, δ_{v}, {\hat{γ}}_{P}$ and ${\hat{γ}}_{D}$ are calculated.

11.1.4 Update and convergence

At each iteration the model parameters $(β, a, s, u, v)$ are updated using step directions, $(δ_{β}, δ_{a}, δ_{s}, δ_{u}, δ_{v})$ and step lengths $({\hat{γ}}_{P}, {\hat{γ}}_{D})$.

Convergence is assessed using the duality gap, that is, the difference between the objective functions in the primal and dual formulations. For any feasible point $(u, v, s, a)$ the duality gap can be calculated from equations (2) and (3) as

$\begin{matrix} τ e^{T} u + (1 - τ) e^{T} v - d^{T} y & = & τ e^{T} u + (1 - τ) e^{T} v - {(a - (1 - τ) e)}^{T} y \\ & = & s^{T} u + a^{T} v \\ & = & e^{T} u - a^{T} y + (1 - τ) e^{T} X β \end{matrix}$

and the optimization terminates if the duality gap is smaller than the tolerance supplied in the optional parameter $Tolerance$.
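The middle form of the duality gap, $s^{T} u + a^{T} v$, needs only the current iterates and is cheap to evaluate. The following sketch shows that computation and the termination test; the function names are hypothetical and the library's own convergence test may include details not described here.

```c
#include <stddef.h>

/* Duality gap s'u + a'v for a feasible point (u, v, s, a). */
double duality_gap(size_t n, const double s[], const double u[],
                   const double a[], const double v[])
{
    double gap = 0.0;
    for (size_t i = 0; i < n; ++i)
        gap += s[i] * u[i] + a[i] * v[i];
    return gap;
}

/* Termination test: 1 when the gap is below the supplied tolerance. */
int has_converged(double gap, double tolerance)
{
    return gap < tolerance;
}
```

Since $a$, $s$, $u$ and $v$ are all kept non-negative, every term of the sum is non-negative and the gap can only reach zero when each product $s_{i} u_{i}$ and $a_{i} v_{i}$ vanishes.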

11.1.5 Additional information

Initial values are required for the parameters $a, s, u, v$ and $β$. If you have not supplied them, initial values for $β$ are calculated from a least squares regression of $y$ on $X$. This regression is carried out by first constructing the cross-product matrix $X^{T} X$ and then using a pivoted $Q R$ decomposition as performed by f08bfc. In addition, if the cross-product matrix is not of full rank, a rank reduction is carried out and, rather than using the full design matrix, $X$, a matrix formed from the first $p$-rank columns of $X P$ is used instead, where $P$ is the pivot matrix used during the $Q R$ decomposition. Parameter estimates, confidence intervals and the rows and columns of the matrices returned in the argument ch (if any) are set to zero for variables dropped during the rank-reduction. The rank reduction step is performed irrespective of whether initial values are supplied by the user.

Once initial values have been obtained for $β$, the initial values for $u$ and $v$ are calculated from the residuals. If $| r_{i} | < ε_{u}$ then a value of $\pm ε_{u}$ is used instead, where $ε_{u}$ is supplied in the optional parameter $Epsilon$. The initial values for $a$ and $s$ are always set to $1 - τ$ and $τ$ respectively.

The solution for $δ_{β}$ in both (7) and (9) is obtained using a Bunch–Kaufman decomposition, as implemented in f07mdc.

11.2 Calculation of Covariance Matrix

g02qgc supplies four methods to calculate the covariance matrices associated with the parameter estimates for $β$. This section gives some additional detail on three of the algorithms; the fourth (which uses bootstrapping) is described in Section 3.

(i) Independent, identically distributed (IID) errors
When assuming IID errors, the covariance matrices depend on the sparsity, $s (τ)$ , which g02qgc estimates as follows:
(a) Let $r_{i}$ denote the residuals from the original quantile regression, that is $r_{i} = y_{i} - x_{i}^{T} \hat{β}$ .
(b) Drop any residual where $| r_{i} |$ is less than $ε_{u}$ , supplied in the optional parameter $Epsilon$ .
(c) Sort and relabel the remaining residuals in ascending order, by absolute value, so that $ε_{u} < | r_{1} | < | r_{2} | < \dots$ .
(d) Select the first $l$ values where $l = h_{n} n$ , for some bandwidth $h_{n}$ .
(e) Sort and relabel these $l$ residuals again, so that $r_{1} < r_{2} < \dots < r_{l}$ and regress them against a design matrix with two columns ( $p = 2$ ) and rows given by $x_{i} = \{1, i / (n - p)\}$ using quantile regression with $τ = 0.5$ .
(f) Use the resulting estimate of the slope as an estimate of the sparsity.
(ii) Powell Sandwich
When using the Powell Sandwich to estimate the matrix $H_{n}$, the quantity

$c_{n} = \min (σ_{r}, (q_{r 3} - q_{r 1}) / 1.34) \times (Φ^{- 1} (τ + h_{n}) - Φ^{- 1} (τ - h_{n}))$

is calculated. Depending on the value of $τ$ and the method used to calculate the bandwidth ($h_{n}$), it is possible for the quantities $τ \pm h_{n}$ to be too large or too small, compared to machine precision ($ε$). More specifically, when $τ - h_{n} \leq \sqrt{ε}$, or $τ + h_{n} \geq 1 - \sqrt{ε}$, a warning flag is raised in info, the value is truncated to $\sqrt{ε}$ or $1 - \sqrt{ε}$ respectively, and the covariance matrix is calculated as usual.
(iii) Hendricks–Koenker Sandwich
The Hendricks–Koenker Sandwich requires the calculation of the quantity $d_{i} = x_{i}^{T} (\hat{β} (τ + h_{n}) - \hat{β} (τ - h_{n}))$. As with the Powell Sandwich, in cases where $τ - h_{n} \leq \sqrt{ε}$, or $τ + h_{n} \geq 1 - \sqrt{ε}$, a warning flag is raised in info, the value is truncated to $\sqrt{ε}$ or $1 - \sqrt{ε}$ respectively, and the covariance matrix is calculated as usual.

In addition, this method requires that $d_{i} > 0$. Hence, in the calculation of $H_{n}$, $\max (2 h_{n} / (d_{i} + ε_{u}), 0)$ is used instead of $2 h_{n} / d_{i}$, where $ε_{u}$ is supplied in the optional parameter Epsilon.
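The two safeguards described for the sandwich estimators can be sketched in C as follows. This is only an illustration of the rules stated above, not the library's code: the names `TauBand`, `clamp_tau_band` and `hks_weight` are hypothetical, and the caller passes in `root_eps` as a stand-in for $\sqrt{ε}$ (the library's own machine precision comes from X02AJC).

```c
#include <assert.h>

/* Hypothetical helper type: the values actually used for tau - h and tau + h. */
typedef struct {
    double lower;   /* value used in place of tau - h */
    double upper;   /* value used in place of tau + h */
    int truncated;  /* nonzero if either end was truncated (warning case) */
} TauBand;

/* Truncate tau - h to root_eps and tau + h to 1 - root_eps when they fall
 * outside (root_eps, 1 - root_eps). root_eps stands for sqrt(machine
 * precision) and is supplied by the caller (an assumption of this sketch). */
TauBand clamp_tau_band(double tau, double h, double root_eps)
{
    assert(tau > 0.0 && tau < 1.0 && h > 0.0);
    assert(root_eps > 0.0 && root_eps < 0.5);
    TauBand band = { tau - h, tau + h, 0 };
    if (band.lower <= root_eps) {
        band.lower = root_eps;
        band.truncated = 1;
    }
    if (band.upper >= 1.0 - root_eps) {
        band.upper = 1.0 - root_eps;
        band.truncated = 1;
    }
    return band;
}

/* Hendricks-Koenker term max(2h / (d + eps_u), 0). A denominator of exactly
 * zero is not treated specially here. */
double hks_weight(double h, double d, double eps_u)
{
    double w = 2.0 * h / (d + eps_u);
    return w > 0.0 ? w : 0.0;
}
```

For example, `clamp_tau_band(0.01, 0.05, 1e-8)` reports truncation because $τ - h_{n}$ is negative, while a negative $d_{i}$ passed to `hks_weight` yields a weight of zero.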

12 Optional Parameters

Several optional parameters in g02qgc control aspects of the optimization algorithm, methodology used, logic or output. Their values are contained in the arrays iopts and opts; these must be initialized before calling g02qgc by first calling g02zkc with optstr set to Initialize = g02qgc.

Each optional parameter has an associated default value; to set any of them to a non-default value, use g02zkc. The current value of an optional parameter can be queried using g02zlc.

The remainder of this section can be skipped if you wish to use the default values for all optional parameters.

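The following is a list of the optional parameters available. A full description of each optional parameter is provided in Section 12.1.

Band Width Alpha
Band Width Method
Big
Bootstrap Interval Method
Bootstrap Iterations
Bootstrap Monitoring
Calculate Initial Values
Defaults
Drop Zero Weights
Epsilon
Interval Method
Iteration Limit
Matrix Returned
Monitoring
QR Tolerance
Return Residuals
Sigma
Significance Level
Tolerance
Unit Number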

12.1 Description of the Optional Parameters

For each option, we give a summary line, a description of the optional parameter and details of constraints.

The summary line contains:

the keywords, where the minimum abbreviation of each keyword is underlined (if no characters of an optional qualifier are underlined, the qualifier may be omitted);
a parameter value, where the letters $a$, $i$ and $r$ denote options that take character, integer and real values respectively;
the default value, where the symbol $ε$ is a generic notation for machine precision (see X02AJC).

Keywords and character values are case and white space insensitive.
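As an illustration of what case and white space insensitivity can mean in practice, the sketch below compares two keywords after ignoring white space and folding case. It is one reading of the sentence above, not the library's parser; `keywords_match` and `next_significant` are hypothetical names, and the minimum-abbreviation rule is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Skip white space, then return the next character folded to upper case,
 * or '\0' once the end of the string is reached. */
static int next_significant(const char **s)
{
    while (**s != '\0' && isspace((unsigned char)**s)) {
        ++*s;
    }
    if (**s == '\0') {
        return '\0';
    }
    return toupper((unsigned char)*(*s)++);
}

/* Return 1 if the two keywords are equal when case and white space are
 * ignored, and 0 otherwise. */
int keywords_match(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;) {
        int ca = next_significant(&a);
        int cb = next_significant(&b);
        if (ca != cb) {
            return 0;
        }
        if (ca == '\0') {
            return 1;
        }
    }
}
```

Under this reading, `Band Width Method`, `band width method` and `BANDWIDTHMETHOD` would all be treated as the same keyword.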

Band Width Alpha   r   Default = 1.0

A multiplier used to construct the parameter $α_{b}$ used when calculating the Sheather–Hall bandwidth (see Section 3), with $α_{b} = (1 - α) \times \text{Band Width Alpha}$. Here, $α$ is the Significance Level.

Constraint: Band Width Alpha > 0.0.
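The relationship between the two options can be written as a one-line function. This is a sketch of the formula above only; `sheather_hall_alpha` is a hypothetical name, and how $α_{b}$ then enters the Sheather–Hall bandwidth is described in Section 3 and not reproduced here.

```c
#include <assert.h>

/* alpha_b = (1 - alpha) * Band Width Alpha, where alpha is the value of the
 * Significance Level option. The assertions mirror the stated constraints. */
double sheather_hall_alpha(double significance_level, double band_width_alpha)
{
    assert(significance_level > 0.0 && significance_level < 1.0);
    assert(band_width_alpha > 0.0);
    return (1.0 - significance_level) * band_width_alpha;
}
```

With the default settings (Significance Level = 0.95, Band Width Alpha = 1.0) this gives a value of approximately 0.05.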

Band Width Method   a   Default = SHEATHER HALL

The method used to calculate the bandwidth used in the calculation of the asymptotic covariance matrix $Σ$ and $H^{- 1}$ if Interval Method = HKS, KERNEL or IID (see Section 3).

Constraint: Band Width Method = SHEATHER HALL or BOFINGER.

Big   r   Default $= {10.0}^{20}$

This parameter should be set to something larger than the biggest value supplied in dat and y.

Constraint: Big > 0.0.

Bootstrap Interval Method   a   Default = QUANTILE

If Interval Method = BOOTSTRAP XY, Bootstrap Interval Method controls how the confidence intervals are calculated from the bootstrap estimates.

Bootstrap Interval Method = T: $t$ intervals are calculated. That is, the covariance matrix, $Σ = {σ_{i j} : i, j = 1, 2, \dots, p}$, is calculated from the bootstrap estimates and the limits calculated as $β_{i} \pm t_{(n - p, (1 + α) / 2)} σ_{i i}$, where $t_{(n - p, (1 + α) / 2)}$ is the $(1 + α) / 2$ percentage point from a Student's $t$ distribution on $n - p$ degrees of freedom, $n$ is the effective number of observations and $α$ is given by the optional parameter Significance Level.
Bootstrap Interval Method = QUANTILE: Quantile intervals are calculated. That is, the upper and lower limits are taken as the $(1 + α) / 2$ and $(1 - α) / 2$ quantiles of the bootstrap estimates, as calculated using g01amc.

Constraint: Bootstrap Interval Method = T or QUANTILE.
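A minimal sketch of the QUANTILE variant for a single coefficient is given below. It assumes linear interpolation between order statistics, which may differ from the rule used by g01amc; the function names are hypothetical and the bootstrap estimates are taken as already computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort callback: ascending order of doubles. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Quantile q (0 <= q <= 1) of a sorted sample by linear interpolation.
 * ASSUMPTION: g01amc may define sample quantiles differently. */
static double sorted_quantile(const double sorted[], size_t n, double q)
{
    double pos = q * (double)(n - 1);
    size_t k = (size_t)pos;         /* index of the lower neighbour */
    double frac = pos - (double)k;  /* fraction of the way to the next value */
    if (k + 1 >= n) {
        return sorted[n - 1];
    }
    return sorted[k] + frac * (sorted[k + 1] - sorted[k]);
}

/* Sorts est[] in place, then stores the (1 - alpha)/2 and (1 + alpha)/2
 * quantiles of the bootstrap estimates in *lower and *upper. */
void bootstrap_quantile_interval(double est[], size_t n, double alpha,
                                 double *lower, double *upper)
{
    assert(est != NULL && lower != NULL && upper != NULL);
    assert(n > 0 && alpha > 0.0 && alpha < 1.0);
    qsort(est, n, sizeof est[0], compare_doubles);
    *lower = sorted_quantile(est, n, (1.0 - alpha) / 2.0);
    *upper = sorted_quantile(est, n, (1.0 + alpha) / 2.0);
}
```

In use, `est` would hold the estimates of one regression coefficient across all bootstrap samples, and `alpha` the value of Significance Level.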

Bootstrap Iterations   i   Default = 100

The number of bootstrap samples used to calculate the confidence limits and covariance matrix (if requested) when Interval Method = BOOTSTRAP XY.

Constraint: Bootstrap Iterations > 1.

Bootstrap Monitoring   a   Default = NO

If Bootstrap Monitoring = YES and Interval Method = BOOTSTRAP XY, then the parameter estimates for each of the bootstrap samples are displayed. This information is sent to the unit number specified by Unit Number.

Constraint: Bootstrap Monitoring = YES or NO.

Calculate Initial Values   a   Default = YES

If Calculate Initial Values = YES then the initial values for the regression parameters, $β$, are calculated from the data. Otherwise they must be supplied in b.

Constraint: Calculate Initial Values = YES or NO.

Defaults

This special keyword is used to reset all optional parameters to their default values.

Drop Zero Weights   a   Default = YES

If a weighted regression is being performed and Drop Zero Weights = YES then observations with zero weight are dropped from the analysis. Otherwise such observations are included.

Constraint: Drop Zero Weights = YES or NO.
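Dropping zero-weight observations amounts to selecting the rows whose weight is nonzero. The sketch below shows that selection step only; `keep_nonzero_weights` is a hypothetical helper, and how the routine itself stores or counts the retained observations is not specified here.

```c
#include <assert.h>
#include <stddef.h>

/* Write the indices of the observations with nonzero weight into kept[]
 * (which must have room for n entries) and return how many were kept. */
size_t keep_nonzero_weights(const double wt[], size_t n, size_t kept[])
{
    assert(wt != NULL && kept != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (wt[i] != 0.0) {
            kept[count++] = i;
        }
    }
    return count;
}
```

The returned indices could then be used to pick the matching rows of the design matrix and entries of the response vector.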

Epsilon   r   Default $= \sqrt{ε}$

$ε_{u}$, the tolerance used when calculating the covariance matrix and the initial values for $u$ and $v$. For additional details see Section 11.2 and Section 11.1.5 respectively.

Constraint: Epsilon $\geq 0.0$.

Interval Method   a   Default = IID

The value of Interval Method controls whether confidence limits are returned in bl and bu and how these limits are calculated. This parameter also controls how the matrices returned in ch are calculated.

Interval Method = NONE: No limits are calculated and bl, bu and ch are not referenced.
Interval Method = KERNEL: The Powell Sandwich method with a Gaussian kernel is used.
Interval Method = HKS: The Hendricks–Koenker Sandwich is used.
Interval Method = IID: The errors are assumed to be independent and identically distributed.
Interval Method = BOOTSTRAP XY: A bootstrap method is used, where sampling is done on the pair $(y_{i}, x_{i})$. The number of bootstrap samples is controlled by the parameter Bootstrap Iterations and the type of interval constructed from the bootstrap samples is controlled by Bootstrap Interval Method.

Constraint: Interval Method = NONE, KERNEL, HKS, IID or BOOTSTRAP XY.

Iteration Limit   i   Default = 100

The maximum number of iterations to be performed by the interior point optimization algorithm.

Constraint: Iteration Limit > 0.

Matrix Returned   a   Default = NONE

The value of Matrix Returned controls the type of matrices returned in ch. If Interval Method = NONE, this parameter is ignored and ch is not referenced. Otherwise:

Matrix Returned = NONE: No matrices are returned and ch is not referenced.
Matrix Returned = COVARIANCE: The covariance matrices are returned.
Matrix Returned = H INVERSE: If Interval Method = KERNEL or HKS, the matrices $J$ and $H^{- 1}$ are returned. Otherwise no matrices are returned and ch is not referenced.

The matrices returned are calculated as described in Section 3, with the algorithm used specified by Interval Method. In the case of Interval Method = BOOTSTRAP XY the covariance matrix is calculated directly from the bootstrap estimates.

Constraint: Matrix Returned = NONE, COVARIANCE or H INVERSE.

Monitoring   a   Default = NO

If Monitoring = YES then the duality gap is displayed at each iteration of the interior point optimization algorithm. In addition, the final estimates for $β$ are displayed.

The monitoring information is sent to the unit number specified by Unit Number.

Constraint: Monitoring = YES or NO.

QR Tolerance   r   Default $= ε^{0.9}$

The tolerance used to calculate the rank, $k$, of the $p \times p$ cross-product matrix, $X^{T} X$. Letting $Q$ be the orthogonal matrix obtained from a $Q R$ decomposition of $X^{T} X$, then the rank is calculated by comparing $Q_{i i}$ with $Q_{11} \times \text{QR Tolerance}$.

If the cross-product matrix is rank deficient, the parameter estimates for the $p - k$ columns with the smallest values of $Q_{i i}$ are set to zero, along with the corresponding entries in bl, bu and ch, if returned. This is equivalent to dropping these variables from the model. Details on the $Q R$ decomposition used can be found in f08bfc.

Constraint: QR Tolerance > 0.0.
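The comparison described above can be sketched as a simple count over the diagonal values. The sketch makes two assumptions that the text does not state: the diagonal entries arrive ordered by non-increasing magnitude (as a pivoted decomposition would give), and an entry counts towards the rank when its magnitude is strictly greater than the threshold. `estimate_rank` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without pulling in the maths library. */
static double abs_value(double x)
{
    return x < 0.0 ? -x : x;
}

/* Count the leading entries of diag[] whose magnitude exceeds
 * |diag[0]| * tol. ASSUMPTION: diag[] is ordered by non-increasing
 * magnitude, so the first failure marks the numerical rank. */
size_t estimate_rank(const double diag[], size_t p, double tol)
{
    assert(diag != NULL && p > 0 && tol > 0.0);
    double threshold = abs_value(diag[0]) * tol;
    size_t k = 0;
    while (k < p && abs_value(diag[k]) > threshold) {
        ++k;
    }
    return k;
}
```

The $p - k$ trailing columns would then be the ones whose estimates are set to zero.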

Return Residuals   a   Default = NO

If Return Residuals = YES, the residuals are returned in res. Otherwise res is not referenced.

Constraint: Return Residuals = YES or NO.

Sigma   r   Default = 0.99995

The scaling factor used when calculating the affine scaling step size (see equation (8)).

Constraint: 0.0 < Sigma < 1.0.

Significance Level   r   Default = 0.95

$α$, the size of the confidence interval whose limits are returned in bl and bu.

Constraint: 0.0 < Significance Level < 1.0.

Tolerance   r   Default $= \sqrt{ε}$

Convergence tolerance. The optimization is deemed to have converged if the duality gap is less than Tolerance (see Section 11.1.4).

Constraint: Tolerance > 0.0.

Unit Number   i   Default: output sent to stdout

The unit number to which any monitoring information is sent. See x04acc for details on how to assign a file to a unit number. If no unit number is specified then any monitoring information will be sent to standard output (stdout).

Constraint: Unit Number > 1.

13 Description of Monitoring Information

See the description of the optional parameter Monitoring.
