NAG FL Interface
g02mcf (lars_param)
1
Purpose
g02mcf calculates additional parameter estimates following Least Angle Regression (LARS), forward stagewise linear regression or Least Absolute Shrinkage and Selection Operator (LASSO) as performed by
g02maf and
g02mbf.
2
Specification
Fortran Interface
Subroutine g02mcf (nstep, ip, b, ldb, fitsum, ktype, nk, lnk, nb, ldnb, ifail)
Integer, Intent (In) :: nstep, ip, ldb, ktype, lnk, ldnb
Integer, Intent (Inout) :: ifail
Real (Kind=nag_wp), Intent (In) :: b(ldb,*), fitsum(6,nstep+1), nk(lnk)
Real (Kind=nag_wp), Intent (Inout) :: nb(ldnb,*)
C Header Interface
#include <nag.h>
void g02mcf_ (const Integer *nstep, const Integer *ip, const double b[], const Integer *ldb, const double fitsum[], const Integer *ktype, const double nk[], const Integer *lnk, double nb[], const Integer *ldnb, Integer *ifail)
C++ Header Interface
#include <nag.h>
extern "C" {
void g02mcf_ (const Integer &nstep, const Integer &ip, const double b[], const Integer &ldb, const double fitsum[], const Integer &ktype, const double nk[], const Integer &lnk, double nb[], const Integer &ldnb, Integer &ifail)
}
The routine may be called by the names g02mcf or nagf_correg_lars_param.
3
Description
g02maf and
g02mbf fit either a LARS, forward stagewise linear regression, LASSO or positive LASSO model to a vector of observed values and a design matrix in which each column is given by one independent variable. The models are fit using the LARS algorithm of
Efron et al. (2004).
The full solution path for all four of these models follows a similar pattern, in which the parameter estimate for a given variable is piecewise linear. One such path, for a LARS model with six variables
can be seen in
Figure 1. Both
g02maf and
g02mbf return the vector of parameter estimates at a number of points along this path. Each point corresponds to a step of the LARS algorithm. The number of steps taken depends on the model being fitted. In the case of a LARS model, each step corresponds to a new variable being included in the model. In the case of the LASSO models, each step corresponds to either a new variable being included in the model or an existing variable being removed from the model; the number of steps is therefore no longer bound by the number of parameters. For forward stagewise linear regression, each step no longer corresponds to the addition or removal of a variable; therefore the number of possible steps is often markedly greater than for a corresponding LASSO model.
g02mcf uses the piecewise linear nature of the solution path to predict the parameter estimates at a different point on this path. The location of the solution can be defined in terms of either a (fractional) step number or a function of the norm of the parameter estimates.
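The idea of reading a solution off a piecewise linear path at a fractional step number can be illustrated with a minimal Python sketch. This is not the NAG implementation: the function name `interpolate_at_step` and the list-of-lists layout are hypothetical, and the sketch assumes the path starts from all-zero estimates at step 0 and ignores any rescaling of the variables.

```python
import math


def interpolate_at_step(b, target):
    """Return parameter estimates at a (possibly fractional) step number.

    b is a list of per-step estimate vectors: b[k - 1] holds the
    estimates after step k. Assumption: the path starts from all-zero
    estimates at step 0.
    """
    nstep = len(b)
    if nstep == 0:
        raise ValueError("at least one fitted step is required")
    if not 0.0 <= target <= nstep:
        raise ValueError("target step must lie in [0, nstep]")
    ip = len(b[0])
    # Prepend the assumed all-zero starting point so path[k] is step k.
    path = [[0.0] * ip] + [list(step) for step in b]
    # Index of the step at the lower end of the segment holding target.
    lower = min(int(math.floor(target)), nstep - 1)
    frac = target - lower
    # Linear interpolation between the two neighbouring steps.
    return [lo + frac * (hi - lo)
            for lo, hi in zip(path[lower], path[lower + 1])]
```

For example, with two fitted steps, a target of 1.5 returns the point halfway between the estimates at step 1 and those at step 2.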
4
References
Efron B, Hastie T, Johnstone I and Tibshirani R (2004) Least Angle Regression The Annals of Statistics (Volume 32) 2 407–499
Hastie T, Tibshirani R and Friedman J (2001) The Elements of Statistical Learning: Data Mining, Inference and Prediction Springer (New York)
Tibshirani R (1996) Regression Shrinkage and Selection via the Lasso Journal of the Royal Statistical Society, Series B (Methodological) (Volume 58) 1 267–288
Weisberg S (1985) Applied Linear Regression Wiley
5
Arguments
-
1: nstep – Integer
Input
-
On entry: the number of steps carried out in the model fitting process, as returned by
g02maf and
g02mbf.
Constraint:
.
-
2: ip – Integer
Input
-
On entry: the number of parameter estimates, as returned by
g02maf and
g02mbf.
Constraint:
.
-
3: b(ldb,*) – Real (Kind=nag_wp) array
Input
-
Note: the second dimension of the array
b
must be at least
.
On entry:
the parameter estimates, as returned by
g02maf and
g02mbf, with
, the parameter estimate for the
th variable, for
, at the
th step of the model fitting process.
Constraint:
b should be unchanged since the last call to
g02maf or
g02mbf.
-
4: ldb – Integer
Input
-
On entry: the first dimension of the array
b as declared in the (sub)program from which
g02mcf is called.
Constraint:
.
-
5: fitsum(6,nstep+1) – Real (Kind=nag_wp) array
Input
-
On entry: summaries of the model fitting process, as returned by
g02maf and
g02mbf.
Constraint:
fitsum should be unchanged since the last call to
g02maf or
g02mbf.
-
6: ktype – Integer
Input
-
On entry: indicates what target values are held in
nk.
- nk holds (fractional) LARS step numbers.
- nk holds values for norm of the (scaled) parameters.
- nk holds ratios with respect to the largest (scaled) norm.
- nk holds values for the norm of the (unscaled) parameters.
- nk holds ratios with respect to the largest (unscaled) norm.
If
g02maf was called with
or
or
g02mbf was called with
then the model fitting routine did not rescale the independent variables,
, prior to fitting the model and therefore there is no difference between
or
and
or
.
Constraint:
, , , or .
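When nk holds norm values or ratios rather than step numbers, the target point has to be located along the path by its norm. The following Python sketch shows one way this could work. It is not the NAG implementation: the names `l1_norm` and `interpolate_at_norm` are hypothetical, and the sketch assumes that the norm in question is the L1 norm, that the path starts from all-zero estimates, that the norm does not decrease from one step to the next, and that no estimate changes sign within a segment (so the norm is linear within each segment).

```python
def l1_norm(v):
    """Sum of absolute values of the entries of v."""
    return sum(abs(x) for x in v)


def interpolate_at_norm(b, target, ratio=False):
    """Return parameter estimates whose L1 norm equals the target.

    b[k - 1] holds the estimates after step k. If ratio is True,
    target is a fraction of the largest norm along the path.
    """
    if not b:
        raise ValueError("at least one fitted step is required")
    ip = len(b[0])
    # Assumed all-zero starting point, followed by the fitted steps.
    path = [[0.0] * ip] + [list(step) for step in b]
    norms = [l1_norm(p) for p in path]
    largest = max(norms)
    if ratio:
        target = target * largest
    if not 0.0 <= target <= largest:
        raise ValueError("target norm lies outside the fitted path")
    for k in range(1, len(path)):
        if target <= norms[k]:
            span = norms[k] - norms[k - 1]
            frac = 0.0 if span == 0.0 else (target - norms[k - 1]) / span
            # Linear interpolation within the bracketing segment.
            return [lo + frac * (hi - lo)
                    for lo, hi in zip(path[k - 1], path[k])]
    raise ValueError("no segment brackets the target norm")
```

Expressing the target as a ratio only rescales it by the largest norm before the same search is carried out, which is why the two kinds of target can share one routine in this sketch.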
-
7: nk(lnk) – Real (Kind=nag_wp) array
Input
-
On entry: target values used for predicting the new set of parameter estimates.
Constraints:
- if , , for ;
- if , , for ;
- if or , , for ;
- if , , for .
-
8: lnk – Integer
Input
-
On entry: number of values supplied in
nk.
Constraint:
.
-
9: nb(ldnb,*) – Real (Kind=nag_wp) array
Output
-
Note: the second dimension of the array
nb
must be at least
.
On exit: the predicted parameter estimates, with , the parameter estimate for variable , at the point in the fitting process associated with , .
-
10: ldnb – Integer
Input
-
On entry: the first dimension of the array
nb as declared in the (sub)program from which
g02mcf is called.
Constraint:
.
-
11: ifail – Integer
Input/Output
-
On entry:
ifail must be set to 0, -1 or 1 to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of 0 causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of -1 means that an error message is printed while a value of 1 means that it is not.
If halting is not appropriate, the value -1 or 1 is recommended. If message printing is undesirable, then the value 1 is recommended. Otherwise, the value -1 is recommended since useful values can be provided in some output arguments even when ifail ≠ 0 on exit.
When the value -1 or 1 is used it is essential to test the value of ifail on exit.
On exit:
ifail = 0 unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry ifail = 0 or -1, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
Note: in some cases g02mcf may return useful information.
-
On entry, .
Constraint: .
-
On entry, .
Constraint: .
-
b has been corrupted since the last call to
g02maf or
g02mbf.
-
On entry, and
Constraint: .
-
fitsum has been corrupted since the last call to
g02maf or
g02mbf.
-
On entry, .
Constraint: , , , or .
-
On entry, , and
Constraint: , for all .
-
On entry, , , and .
Constraint: , for all .
-
On entry, or , .
Constraint: , for all .
-
On entry, , and
Constraint: , for all .
-
On entry, .
Constraint: .
-
On entry, and .
Constraint: .
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 7 in the Introduction to the NAG Library FL Interface for further information.
Your licence key may have expired or may not have been installed correctly.
See
Section 8 in the Introduction to the NAG Library FL Interface for further information.
Dynamic memory allocation failed.
See
Section 9 in the Introduction to the NAG Library FL Interface for further information.
7
Accuracy
Not applicable.
8
Parallelism and Performance
g02mcf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
g02mcf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the
X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the
Users' Note for your implementation for any additional implementation-specific information.
9
Further Comments
None.
10
Example
This example performs a LARS on a simulated dataset with observations and independent variables.
Additional parameter estimates are obtained corresponding to a LARS step number of and . Where, for example, corresponds to the solution halfway between that obtained at step and that obtained at step .
10.1
Program Text
10.2
Program Data
10.3
Program Results