naginterfaces.library.correg.linregm_fit_onestep

naginterfaces.library.correg.linregm_fit_onestep(istep, x, vname, isx, y, model, nterm, rss, idf, ifr, free, q, p, mean='M', wt=None, fin=2.0)

linregm_fit_onestep carries out one step of a forward selection procedure in order to enable the ‘best’ linear regression model to be found.

For full information please refer to the NAG Library document for g02ee

https://support.nag.com/numeric/nl/nagdoc_30.1/flhtml/g02/g02eef.html

Parameters
istep : int

Indicates which step in the forward selection process is to be carried out.

istep = 0: the process is initialized.

x : float, array-like, shape (n, m)

x[i-1, j-1] must contain the ith observation for the jth independent variable, for i = 1, 2, …, n, for j = 1, 2, …, m.

vname : str, array-like, shape (m)

vname[j-1] must contain the name of the independent variable in column j of x, for j = 1, 2, …, m.

isx : int, array-like, shape (m)

Indicates which independent variables could be considered for inclusion in the regression.

isx[j-1] ≥ 2: the variable contained in the jth column of x is automatically included in the regression model, for j = 1, 2, …, m.

isx[j-1] = 1: the variable contained in the jth column of x is considered for inclusion in the regression model, for j = 1, 2, …, m.

isx[j-1] = 0: the variable in the jth column is not considered for inclusion in the model, for j = 1, 2, …, m.
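As an illustration of how the isx flags partition the columns, here is a minimal sketch. It assumes the usual NAG coding (0 = excluded, 1 = candidate, 2 or more = forced); the helper name classify_columns is hypothetical and is not part of naginterfaces.

```python
def classify_columns(vname, isx):
    """Split variable names into forced, candidate and excluded groups.

    Hypothetical helper for illustration only; assumes the coding
    isx >= 2 (forced), isx == 1 (candidate), isx == 0 (excluded).
    """
    if len(vname) != len(isx):
        raise ValueError("vname and isx must have the same length")
    if any(flag < 0 for flag in isx):
        raise ValueError("isx values must be non-negative")
    forced = [name for name, flag in zip(vname, isx) if flag >= 2]
    candidates = [name for name, flag in zip(vname, isx) if flag == 1]
    excluded = [name for name, flag in zip(vname, isx) if flag == 0]
    return forced, candidates, excluded
```

For example, calling classify_columns(["AGE", "WT", "HT"], [2, 1, 0]) places AGE in the forced group, WT among the candidates and HT among the excluded columns.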

y : float, array-like, shape (n)

The dependent variable.

model : str, array-like, shape (maxip)

If istep = 0, model need not be set.

If istep ≠ 0, model must contain the values returned by the previous call to linregm_fit_onestep.

nterm : int

If istep = 0, nterm need not be set.

If istep ≠ 0, nterm must contain the value returned by the previous call to linregm_fit_onestep.

rss : float

If istep = 0, rss need not be set.

If istep ≠ 0, rss must contain the value returned by the previous call to linregm_fit_onestep.

idf : int

If istep = 0, idf need not be set.

If istep ≠ 0, idf must contain the value returned by the previous call to linregm_fit_onestep.

ifr : int

If istep = 0, ifr need not be set.

If istep ≠ 0, ifr must contain the value returned by the previous call to linregm_fit_onestep.

free : str, array-like, shape (maxip)

If istep = 0, free need not be set.

If istep ≠ 0, free must contain the values returned by the previous call to linregm_fit_onestep.

q : float, array-like, shape (n, maxip + 2)

If istep = 0, q need not be set.

If istep ≠ 0, q must contain the values returned by the previous call to linregm_fit_onestep.

p : float, array-like, shape (maxip + 1)

If istep = 0, p need not be set.

If istep ≠ 0, p must contain the values returned by the previous call to linregm_fit_onestep.

mean : str, length 1, optional

Indicates if a mean term is to be included.

mean = 'M': a mean term, intercept, will be included in the model.

mean = 'Z': the model will pass through the origin, zero-point.

wt : None or float, array-like, shape (n), optional

If provided, wt must contain the weights to be used with the model.

If wt[i-1] = 0.0, the ith observation is not included in the model, in which case the effective number of observations is the number of observations with nonzero weights.

If wt is not provided, the effective number of observations is n.
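The rule for the effective number of observations can be sketched in a few lines. The function name effective_observations is hypothetical and only mirrors the description above; it is not a naginterfaces call.

```python
def effective_observations(y, wt=None):
    """Count the observations that contribute to the fit.

    Hypothetical helper: with no weights every observation counts;
    otherwise only observations with a nonzero weight count.
    """
    if wt is None:
        return len(y)
    if len(wt) != len(y):
        raise ValueError("wt and y must have the same length")
    return sum(1 for w in wt if w != 0.0)
```

With four observations and weights [1.0, 0.0, 2.0, 0.0], two observations remain in the fit.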

fin : float, optional

The critical value of the F statistic for the term to be included in the model, F_c.

Returns
istep : int

Is incremented by 1.

addvar : bool

Indicates if a variable has been added to the model.

addvar = True: a variable has been added to the model.

addvar = False: no variable had an F value greater than F_c and none were added to the model.

newvar : str

If addvar = True, newvar contains the name of the variable added to the model.

chrss : float

If addvar = True, chrss contains the change in the residual sum of squares due to adding variable newvar.

f : float

If addvar = True, f contains the F statistic for the inclusion of the variable in newvar.

model : str, ndarray, shape (maxip)

The names of the variables in the current model.

nterm : int

The number of independent variables in the current model, not including the mean, if any.

rss : float

The residual sum of squares for the current model.

idf : int

The degrees of freedom for the residual sum of squares for the current model.

ifr : int

The number of free independent variables, i.e., the number of variables not in the model that are still being considered for selection.

free : str, ndarray, shape (maxip)

The first ifr values of free contain the names of the free variables.

exss : float, ndarray, shape (maxip)

The first ifr values of exss contain what would be the change in regression sum of squares if the free variables had been added to the model, i.e., the extra sum of squares for the free variables. exss[i-1] contains what would be the change in regression sum of squares if the variable free[i-1] had been added to the model.

q : float, ndarray, shape (n, maxip + 2)

The results of the QR decomposition for the current model:

the first column of q contains c = Q^T y (or Q^T W^(1/2) y, where W is the vector of weights if used);

the upper triangular part of columns 2 to p + 1 contain the R matrix;

the strictly lower triangular part of columns 2 to p + 1 contain details of the Q matrix;

the remaining p + 1 to p + ifr columns of q contain Q^T X_free (or Q^T W^(1/2) X_free),

where p = nterm, or p = nterm + 1 if mean = 'M'.

p : float, ndarray, shape (maxip + 1)

The first p elements of the array p contain details of the QR decomposition, where the count p = nterm, or p = nterm + 1 if mean = 'M'.

Raises
NagValueError
(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, and .

Constraint: if , .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: or .

(errno )

On entry, .

Constraint: or .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: .

(errno )

On entry, .

Constraint: , for .

(errno )

On entry, number of forced variables .

(errno )

Degrees of freedom for error will equal 0 if a new variable is added, i.e., the number of variables in the model plus 1 is equal to the effective number of observations.

(errno )

On entry, .

Constraint: maxip must be large enough to accommodate the number of terms given by isx.

(errno )

On entry, .

Constraint: , for .

(errno )

On entry, isx[i-1] = 0, for all i = 1, 2, …, m.

Constraint: at least one value of isx must be nonzero.

Warns
NagAlgorithmicWarning
(errno )

On entry, the variables forced into the model are not of full rank, i.e., some of these variables are linear combinations of others.

(errno )

There are no free variables, i.e., no element of isx = 1.

(errno )

The value of the change in the sum of squares is greater than the input value of rss. This may occur due to rounding errors if the true residual sum of squares for the new model is small relative to the residual sum of squares for the previous model.

Notes

One method of selecting a linear regression model from a given set of independent variables is by forward selection. The following procedure is used:

  1. Select the best-fitting independent variable, i.e., the independent variable which gives the smallest residual sum of squares. If the F-test for this variable is greater than a chosen critical value, F_c, then include the variable in the model, else stop.

  2. Find the independent variable that leads to the greatest reduction in the residual sum of squares when added to the current model.

  3. If the F-test for this variable is greater than a chosen critical value, F_c, then include the variable in the model and go to (2), otherwise stop.
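The three steps above can be sketched in pure Python. This is an illustrative sketch only, not the NAG implementation: it assumes a mean term, unit weights and no forced variables, and it uses successive orthogonalisation of the free columns instead of the QR updating the routine performs. The function name forward_selection and its arguments are hypothetical.

```python
def forward_selection(columns, y, f_crit=2.0):
    """Return [(column_index, F), ...] in order of entry into the model."""

    def centre(v):
        mean = sum(v) / len(v)
        return [a - mean for a in v]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    n = len(y)
    resid = centre(y)                       # residuals after fitting the mean
    free = {j: centre(col) for j, col in enumerate(columns)}
    scale = {j: dot(col, col) for j, col in free.items()}
    rss = dot(resid, resid)
    df = n - 1                              # residual degrees of freedom
    entered = []

    while free and df > 1:
        # Extra sum of squares each free variable would contribute.
        best_j, best_ss = None, 0.0
        for j, col in free.items():
            norm = dot(col, col)
            if norm <= 1e-12 * scale[j]:    # (nearly) collinear with the model
                continue
            extra_ss = dot(col, resid) ** 2 / norm
            if extra_ss > best_ss:
                best_j, best_ss = j, extra_ss
        if best_j is None:
            break

        new_rss = max(rss - best_ss, 0.0)
        f_stat = best_ss / (new_rss / (df - 1)) if new_rss > 0.0 else float("inf")
        if f_stat <= f_crit:                # F-test fails: stop
            break

        # Add the variable, then sweep it out of y and the remaining columns.
        v = free.pop(best_j)
        vv = dot(v, v)
        b = dot(v, resid) / vv
        resid = [r - b * a for r, a in zip(resid, v)]
        for j, col in free.items():
            g = dot(v, col) / vv
            free[j] = [c - g * a for c, a in zip(col, v)]
        rss, df = new_rss, df - 1
        entered.append((best_j, f_stat))

    return entered
```

The F statistic here is the usual partial F, the reduction in the residual sum of squares divided by the residual mean square of the enlarged model. The swept columns in free play the same role as the updated matrix of free variables kept by the routine.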

At any step the variables not in the model are known as the free terms.

linregm_fit_onestep allows you to specify some independent variables that must be in the model; these are known as forced variables.

The computational procedure involves the use of QR decompositions, the R and the Q matrices being updated as each new variable is added to the model. In addition the matrix Q^T X_free, where X_free is the matrix of variables not included in the model, is updated.

linregm_fit_onestep computes one step of the forward selection procedure at a call. The results produced at each step may be printed or used as inputs to linregm_update(), in order to compute the regression coefficients for the model fitted at that step. Repeated calls to linregm_fit_onestep should be made until F < F_c is indicated (addvar = False).
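A calling loop along these lines might look like the sketch below. It is written against a callable step that is assumed to have the signature and the 14-item return value documented on this page, so that it can be tried with a stand-in function. The workspace sizes (maxip and the shapes built from it) and the early exit when ifr reaches 0 are assumptions, not confirmed behaviour of the wrapper; the name run_forward_selection is hypothetical.

```python
def run_forward_selection(step, x, vname, isx, y, mean="M", wt=None, fin=2.0):
    """Call `step` repeatedly until it reports that no variable was added."""
    n = len(y)
    # Assumed workspace size: one slot per forced or candidate variable,
    # plus one for the mean term when mean == "M".
    maxip = sum(1 for flag in isx if flag > 0) + (1 if mean == "M" else 0)

    # On the first call (istep = 0) these need not hold meaningful values.
    model = [""] * maxip
    free = [""] * maxip
    q = [[0.0] * (maxip + 2) for _ in range(n)]
    p = [0.0] * (maxip + 1)
    istep, nterm, rss, idf, ifr = 0, 0, 0.0, 0, 0

    added = []
    while True:
        (istep, addvar, newvar, chrss, f, model, nterm, rss, idf, ifr,
         free, exss, q, p) = step(
            istep, x, vname, isx, y, model, nterm, rss, idf, ifr,
            free, q, p, mean=mean, wt=wt, fin=fin)
        if not addvar:
            break                      # no variable passed the F-test
        added.append((newvar, chrss, f))
        if ifr == 0:
            break                      # assumed: nothing left to consider
    return added, nterm, rss
```

With the library installed, step would be correreg-style passed as correg.linregm_fit_onestep after from naginterfaces.library import correg. Each pass feeds the returned model, nterm, rss, idf, ifr, free, q and p straight back in, which is the contract the parameter descriptions set out for istep ≠ 0.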

References

Draper, N R and Smith, H, 1985, Applied Regression Analysis, (2nd Edition), Wiley

Weisberg, S, 1985, Applied Linear Regression, Wiley