naginterfaces.library.correg.pls_wold

naginterfaces.library.correg.pls_wold(x, isx, y, iscale, xstd, ystd, maxfac, maxit=200, tau=0.0001, io_manager=None)

pls_wold fits an orthogonal scores partial least squares (PLS) regression by using Wold’s iterative method.

For full information, please refer to the NAG Library document for g02lb:

https://support.nag.com/numeric/nl/nagdoc_30.2/flhtml/g02/g02lbf.html

Parameters
x : float, array-like, shape (n, mx)

x[i-1, j-1] must contain the $i$th observation on the $j$th predictor variable, for $i = 1, \ldots, n$, for $j = 1, \ldots, \mathrm{mx}$.

isx : int, array-like, shape (mx)

Indicates which predictor variables are to be included in the model.

isx[j-1] = 1: the $j$th predictor variable (with variates in the $j$th column of $X$) is included in the model.

isx[j-1] = 0: otherwise.

y : float, array-like, shape (n, my)

y[i-1, j-1] must contain the $i$th observation for the $j$th response variable, for $i = 1, \ldots, n$, for $j = 1, \ldots, \mathrm{my}$.

iscale : int

Indicates how predictor variables are scaled.

iscale = 1: data are scaled by the standard deviation of variables.

iscale = 2: data are scaled by user-supplied scalings.

iscale = -1: no scaling.

xstd : float, array-like, shape (ip)

If iscale = 2, xstd[j-1] must contain the user-supplied scaling for the $j$th predictor variable in the model, for $j = 1, \ldots, \mathrm{ip}$. Otherwise xstd need not be set.

ystd : float, array-like, shape (my)

If iscale = 2, ystd[j-1] must contain the user-supplied scaling for the $j$th response variable in the model, for $j = 1, \ldots, \mathrm{my}$. Otherwise ystd need not be set.

maxfac : int

$k$, the number of latent variables to calculate.

maxit : int, optional

If my = 1, maxit is not referenced; otherwise the maximum number of iterations used to calculate the $x$-weights.

tau : float, optional

If my = 1, tau is not referenced; otherwise the iterative procedure used to calculate the $x$-weights will halt if the Euclidean distance between two subsequent estimates is less than or equal to tau.

io_manager : FileObjManager, optional

Manager for I/O in this routine.

Returns
xbar : float, ndarray, shape (ip)

Mean values of predictor variables in the model.

ybar : float, ndarray, shape (my)

The mean value of each response variable.

xstd : float, ndarray, shape (ip)

If iscale = 1, standard deviations of predictor variables in the model. Otherwise xstd is not changed.

ystd : float, ndarray, shape (my)

If iscale = 1, the standard deviation of each response variable. Otherwise ystd is not changed.

xres : float, ndarray, shape (n, ip)

The predictor variables’ residual matrix $X_k$.

yres : float, ndarray, shape (n, my)

The residuals for each response variable, $Y_k$.

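w : float, ndarray, shape (ip, maxfac)

The $j$th column of $W$ contains the $x$-weights $w_j$, for $j = 1, \ldots, \mathrm{maxfac}$.

p : float, ndarray, shape (ip, maxfac)

The $j$th column of $P$ contains the $x$-loadings $p_j$, for $j = 1, \ldots, \mathrm{maxfac}$.

t : float, ndarray, shape (n, maxfac)

The $j$th column of $T$ contains the $x$-scores $t_j$, for $j = 1, \ldots, \mathrm{maxfac}$.

c : float, ndarray, shape (my, maxfac)

The $j$th column of $C$ contains the $y$-loadings $c_j$, for $j = 1, \ldots, \mathrm{maxfac}$.

u : float, ndarray, shape (n, maxfac)

The $j$th column of $U$ contains the $y$-scores $u_j$, for $j = 1, \ldots, \mathrm{maxfac}$.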

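xcv : float, ndarray, shape (maxfac)

xcv[j-1] contains the cumulative percentage of variance in the predictor variables explained by the first $j$ factors, for $j = 1, \ldots, \mathrm{maxfac}$.

ycv : float, ndarray, shape (maxfac, my)

ycv[i-1, j-1] is the cumulative percentage of variance of the $j$th response variable explained by the first $i$ factors, for $i = 1, \ldots, \mathrm{maxfac}$, for $j = 1, \ldots, \mathrm{my}$.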
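
As a rough illustration of what xcv and ycv report, the NumPy sketch below computes cumulative percentages of explained sum of squares from a set of orthonormal scores. It is not the NAG implementation. The helper name cumulative_variance, the random data and the stand-in scores obtained from a QR factorisation are all hypothetical, and it assumes that "variance explained" means the share of the centred sum of squares reproduced by the fitted part.

```python
import numpy as np

def cumulative_variance(Xc, Yc, T):
    """Cumulative % of sum of squares explained by the first 1..k score columns.

    Assumes the columns of T are orthonormal and that Xc and Yc are already
    mean-centred (and scaled, if scaling is in use).
    """
    P = Xc.T @ T                      # x-loadings, one column per factor
    C = Yc.T @ T                      # y-loadings, one column per factor
    xcv = 100.0 * np.cumsum((P ** 2).sum(axis=0)) / (Xc ** 2).sum()
    ycv = 100.0 * np.cumsum(C.T ** 2, axis=0) / (Yc ** 2).sum(axis=0)
    return xcv, ycv

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 4))
Y = rng.normal(size=(15, 2))
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# Stand-in orthonormal scores: QR of three linear combinations of the predictors.
T, _ = np.linalg.qr(Xc @ rng.normal(size=(4, 3)))

xcv, ycv = cumulative_variance(Xc, Yc, T)
```

Here xcv has one entry per factor and ycv has one row per factor and one column per response, matching the shapes (maxfac) and (maxfac, my) listed above.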

Raises
NagValueError
(errno )

On entry, n = ⟨value⟩.

Constraint: n > 1.

(errno )

On entry, mx = ⟨value⟩.

Constraint: mx > 1.

(errno )

On entry, element ⟨value⟩ of isx is invalid.

Constraint: isx[j-1] = 0 or 1, for all $j$.

(errno )

On entry, my = ⟨value⟩.

Constraint: my ≥ 1.

(errno )

On entry, iscale = ⟨value⟩.

Constraint: iscale = -1, 1 or 2.

(errno )

On entry, ip = ⟨value⟩ and mx = ⟨value⟩.

Constraint: 1 ≤ ip ≤ mx.

(errno )

On entry, maxfac = ⟨value⟩ and ip = ⟨value⟩.

Constraint: 1 ≤ maxfac ≤ ip.

(errno )

On entry, my = ⟨value⟩ and maxit = ⟨value⟩.

Constraint: if my > 1, maxit > 1.

(errno )

On entry, tau = ⟨value⟩.

Constraint: if my > 1, tau > 0.0.

(errno )

On entry, ip = ⟨value⟩ and the sum of the elements of isx = ⟨value⟩.

Constraint: the sum of elements in isx must equal ip.

Notes

Let $X_1$ be the mean-centred $n \times m$ data matrix $X$ of $n$ observations on $m$ predictor variables. Let $Y_1$ be the mean-centred $n \times r$ data matrix $Y$ of $n$ observations on $r$ response variables.

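The first of the $k$ factors PLS methods extract from the data predicts both $X_1$ and $Y_1$ by regressing on a column vector $t_1$ of $n$ scores:

$$\hat{X}_1 = t_1 p_1^T, \qquad \hat{Y}_1 = t_1 c_1^T, \qquad \text{with } t_1^T t_1 = 1,$$

where the column vectors of $m$ $x$-loadings $p_1$ and $r$ $y$-loadings $c_1$ are calculated in the least squares sense:

$$p_1^T = t_1^T X_1, \qquad c_1^T = t_1^T Y_1.$$

The $x$-score vector $t_1 = X_1 w_1$ is the linear combination of predictor data $X_1$ that has maximum covariance with the $y$-scores $u_1 = Y_1 c_1$, where the $x$-weights vector $w_1$ is the normalised first left singular vector of $X_1^T Y_1$.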
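
The first-factor relations can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the NAG implementation: the helper name first_pls_factor and the random data are hypothetical, the weights come from a direct SVD rather than Wold's iteration, and the weights are assumed to be rescaled so that the score vector has unit length.

```python
import numpy as np

def first_pls_factor(X, Y):
    """Return (w, t, p, c, u) for the first PLS factor of raw data X and Y."""
    X1 = X - X.mean(axis=0)           # mean-centre the predictors
    Y1 = Y - Y.mean(axis=0)           # mean-centre the responses

    # The first left singular vector of X1^T Y1 gives the x-weights direction.
    U, s, Vt = np.linalg.svd(X1.T @ Y1, full_matrices=False)
    w = U[:, 0]

    t = X1 @ w                        # x-scores
    scale = np.linalg.norm(t)
    w, t = w / scale, t / scale       # assumed normalisation: t^T t = 1

    p = X1.T @ t                      # x-loadings, p^T = t^T X1
    c = Y1.T @ t                      # y-loadings, c^T = t^T Y1
    u = Y1 @ c                        # y-scores
    return w, t, p, c, u

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
Y = rng.normal(size=(20, 2))
w, t, p, c, u = first_pls_factor(X, Y)
```

With unit-length scores, the least squares loadings reduce to plain matrix-vector products, which is why no division by $t^T t$ appears in the sketch.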

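The method extracts subsequent PLS factors by repeating the above process with the residual matrices:

$$X_i = X_{i-1} - \hat{X}_{i-1}, \qquad Y_i = Y_{i-1} - \hat{Y}_{i-1}, \qquad i = 2, 3, \ldots, k,$$

where $\hat{X}_{i-1} = t_{i-1} p_{i-1}^T$ and $\hat{Y}_{i-1} = t_{i-1} c_{i-1}^T$ are the fitted values from factor $i-1$, and with orthogonal scores:

$$t_i^T t_j = 0, \qquad j = 1, 2, \ldots, i-1.$$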
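
The repeated extraction with deflation can be sketched as a NIPALS-style loop. This is a minimal illustration, not the NAG routine: the function name pls_wold_sketch, the choice of starting vector and the random data are assumptions, and only the roles of maxfac, maxit and tau are borrowed from the parameter list.

```python
import numpy as np

def pls_wold_sketch(X, Y, maxfac, maxit=200, tau=1e-4):
    """Iteratively extract maxfac PLS factors with orthogonal, unit-length scores."""
    Xi = X - X.mean(axis=0)                   # current predictor residuals
    Yi = Y - Y.mean(axis=0)                   # current response residuals
    n, m = Xi.shape
    r = Yi.shape[1]
    W = np.zeros((m, maxfac)); P = np.zeros((m, maxfac))
    T = np.zeros((n, maxfac)); C = np.zeros((r, maxfac))
    U = np.zeros((n, maxfac))
    for a in range(maxfac):
        # Assumed start: the response column with the largest sum of squares.
        u = Yi[:, np.argmax((Yi ** 2).sum(axis=0))]
        w = Xi.T @ u
        w /= np.linalg.norm(w)
        for _ in range(maxit):
            t = Xi @ w
            c = Yi.T @ t
            u = Yi @ c
            w_new = Xi.T @ u
            w_new /= np.linalg.norm(w_new)
            # Stop when two successive weight estimates are within tau.
            converged = np.linalg.norm(w_new - w) <= tau
            w = w_new
            if converged:
                break
        t = Xi @ w
        scale = np.linalg.norm(t)
        w, t = w / scale, t / scale           # unit-length scores
        p = Xi.T @ t                          # x-loadings
        c = Yi.T @ t                          # y-loadings
        u = Yi @ c                            # y-scores
        Xi = Xi - np.outer(t, p)              # deflate the predictors
        Yi = Yi - np.outer(t, c)              # deflate the responses
        W[:, a], P[:, a], T[:, a], C[:, a], U[:, a] = w, p, t, c, u
    return W, P, T, C, U, Xi, Yi

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
Y = rng.normal(size=(30, 2))
W, P, T, C, U, Xres, Yres = pls_wold_sketch(X, Y, maxfac=3)
```

Because each deflation removes the component of the residual matrix along the current score vector, later scores are orthogonal to earlier ones, and the centred predictors equal T @ P.T plus the final residual Xres.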

Optionally, in addition to being mean-centred, the data matrices $X_1$ and $Y_1$ may be scaled by standard deviations of the variables. If data are supplied mean-centred, the calculations are not affected within numerical accuracy.
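
The preprocessing step can be illustrated with a short NumPy sketch. It is not the NAG implementation: the helper name centre_and_scale is hypothetical, the mode codes 1, 2 and -1 are used here as labels for the three scaling options described under iscale, and the sample standard deviation (divisor $n - 1$) is an assumption.

```python
import numpy as np

def centre_and_scale(A, mode=1, user_std=None):
    """Mean-centre the columns of A, then optionally scale them.

    mode = 1  : scale by column standard deviations
    mode = 2  : scale by user-supplied scalings (user_std)
    mode = -1 : no scaling
    """
    A = np.asarray(A, dtype=float)
    mean = A.mean(axis=0)
    centred = A - mean
    if mode == 1:
        # Assumption: sample standard deviation with divisor n - 1.
        std = centred.std(axis=0, ddof=1)
    elif mode == 2:
        std = np.asarray(user_std, dtype=float)
    elif mode == -1:
        std = np.ones(A.shape[1])
    else:
        raise ValueError("mode must be -1, 1 or 2")
    return centred / std, mean, std

rng = np.random.default_rng(2)
X = rng.normal(loc=5.0, scale=3.0, size=(12, 3))

Z, xbar, xstd = centre_and_scale(X, mode=1)

# Supplying data that are already mean-centred gives the same scaled matrix.
Z_precentred, _, _ = centre_and_scale(X - X.mean(axis=0), mode=1)
```

The last call mirrors the remark that supplying mean-centred data does not affect the calculations beyond rounding error.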

References

Wold, H, 1966, Estimation of principal components and related models by iterative least squares, In: Multivariate Analysis, (ed P R Krishnaiah), 391–420, Academic Press NY