naginterfaces.library.smooth.fit_spline

naginterfaces.library.smooth.fit_spline(mode, x, y, rho, c, comm, wt=None)

fit_spline fits a cubic smoothing spline for a given smoothing parameter.

For full information please refer to the NAG Library document for g10ab

https://support.nag.com/numeric/nl/nagdoc_30/flhtml/g10/g10abf.html

Parameters
mode : str, length 1

Indicates in which mode the function is to be used.

mode = 'P'

Initialization and fitting are performed. This partial fit can be used in an iterative weighted least squares context where the weights change at each call to fit_spline, or when the coefficients are not required.

mode = 'Q'

Fitting only is performed. Initialization must have been performed previously by a call to fit_spline with mode = 'P'. This quick fit may be called repeatedly with different values of rho without re-initialization.

mode = 'F'

Initialization and full fitting are performed and the function coefficients are calculated.

x : float, array-like, shape (n)

The distinct and ordered values $x_i$, for $i = 1, 2, \ldots, n$.

y : float, array-like, shape (n)

The values $y_i$, for $i = 1, 2, \ldots, n$.

rho : float

$\rho$, the smoothing parameter.

c : float, array-like, shape (n-1, 3)

If mode = 'Q', c must be unaltered from the previous call to fit_spline with mode = 'P'. Otherwise c need not be set.

comm : dict, communication object, modified in place

Communication structure.

On initial entry: need not be set.

wt : None or float, array-like, shape (n), optional

If wt is not None, wt must contain the $n$ weights. Otherwise wt is not referenced and unit weights are assumed.

Returns
yhat : float, ndarray, shape (n)

The fitted values, $\hat{y}_i$, for $i = 1, 2, \ldots, n$.

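c : float, ndarray, shape (n-1, 3)

If mode = 'F', c contains the spline coefficients. More precisely, the value of the spline at $t$ is given by $((c[i-1,2] \times d + c[i-1,1]) \times d + c[i-1,0]) \times d + \hat{y}_i$, where $x_i \le t < x_{i+1}$ and $d = t - x_i$.

If mode = 'P' or 'Q', c contains information that will be used in a subsequent call to fit_spline with mode = 'Q'.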

rss : float

The (weighted) residual sum of squares.

df : float

The residual degrees of freedom.

res : float, ndarray, shape (n)

The (weighted) residuals, $r_i$, for $i = 1, 2, \ldots, n$.

h : float, ndarray, shape (n)

The leverages, $h_{ii}$, for $i = 1, 2, \ldots, n$.
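As an illustration of the coefficient layout described for the returned c, the following minimal sketch evaluates the spline at a point from x, yhat and c. The helper name eval_spline is hypothetical and not part of the library, zero-based Python indexing is assumed (row k of c belongs to the interval from x[k] to x[k+1]), and the coefficients in the example are hand-made values that reproduce $t^3$, not output from fit_spline.

```python
from bisect import bisect_right


def eval_spline(x, yhat, c, t):
    """Evaluate the fitted cubic spline at t (hypothetical helper).

    x    : strictly increasing abscissae, length n
    yhat : fitted values at the abscissae, length n
    c    : spline coefficients from a mode='F' fit, n-1 rows of 3 values
    """
    if not (x[0] <= t < x[-1]):
        # The documented formula only covers x_i <= t < x_{i+1}.
        raise ValueError("t must satisfy x[0] <= t < x[-1]")
    # Zero-based index k of the interval with x[k] <= t < x[k+1].
    k = bisect_right(x, t) - 1
    d = t - x[k]
    # Horner form of yhat[k] + c[k][0]*d + c[k][1]*d**2 + c[k][2]*d**3.
    return ((c[k][2] * d + c[k][1]) * d + c[k][0]) * d + yhat[k]


# Hand-made coefficients that reproduce f(t) = t**3 on the points 0, 1, 2.
x = [0.0, 1.0, 2.0]
yhat = [0.0, 1.0, 8.0]
c = [[0.0, 0.0, 1.0], [3.0, 3.0, 1.0]]
print(eval_spline(x, yhat, c, 0.5))  # 0.125
print(eval_spline(x, yhat, c, 1.5))  # 3.375
```

The sketch rejects points outside the fitted range rather than extrapolating, because the formula above is only stated for $x_i \le t < x_{i+1}$.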

Raises
NagValueError
(errno )

On entry, mode = ⟨value⟩.

Constraint: mode = 'P', 'Q' or 'F'.

(errno )

On entry, n = ⟨value⟩.

Constraint: n ≥ 3.

(errno )

On entry, rho = ⟨value⟩.

Constraint: rho ≥ 0.0.

(errno )

On entry, at least one element of wt ≤ 0.0.

(errno )

On entry, x is not a strictly ordered array.
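The conditions above can be checked before calling the function. The following is a minimal sketch, not the library's own validation: the helper name check_inputs is hypothetical, and the constraints it encodes (at least three observations, a non-negative rho, strictly positive weights, strictly increasing x) are assumptions taken from the underlying g10ab routine document.

```python
def check_inputs(x, y, rho, wt=None):
    """Raise ValueError if the inputs break the assumed constraints."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if rho < 0.0:
        raise ValueError("rho must be non-negative")
    if wt is not None:
        if len(wt) != n:
            raise ValueError("wt must have the same length as x")
        if any(w <= 0.0 for w in wt):
            raise ValueError("all weights must be strictly positive")
    # Compare each value with its successor: ties are rejected too.
    if any(b <= a for a, b in zip(x, x[1:])):
        raise ValueError("x must be strictly increasing")


check_inputs([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], rho=0.5)  # passes silently
```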

Notes

fit_spline fits a cubic smoothing spline to a set of $n$ observations ($x_i$, $y_i$), for $i = 1, 2, \ldots, n$. The spline provides a flexible smooth function for situations in which a simple polynomial or nonlinear regression model is unsuitable.

Cubic smoothing splines arise as the unique real-valued solution function $f$, with absolutely continuous first derivative and squared-integrable second derivative, which minimizes:

$$\sum_{i=1}^{n} w_i \left(y_i - f(x_i)\right)^2 + \rho \int_{-\infty}^{\infty} \left(f''(x)\right)^2 \, dx,$$

where $w_i$ is the (optional) weight for the $i$th observation and $\rho$ is the smoothing parameter. This criterion consists of two parts: the first measures the fit of the curve, and the second the smoothness of the curve. The value of the smoothing parameter weights these two aspects; larger values of $\rho$ give a smoother fitted curve but, in general, a poorer fit. For details of how the cubic spline can be estimated see Hutchinson and de Hoog (1985) and Reinsch (1967).

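The fitted values, $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)^T$, and weighted residuals, $r_i$, can be written as

$$\hat{y} = H y \quad \text{and} \quad r_i = \sqrt{w_i}\,(y_i - \hat{y}_i)$$

for a matrix $H$. The residual degrees of freedom for the spline is $\mathrm{trace}(I - H)$ and the diagonal elements of $H$, $h_{ii}$, are the leverages.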
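To show how the returned diagnostics relate to one another, here is a minimal sketch that recomputes the weighted residuals, the residual sum of squares and the residual degrees of freedom from the fitted values and leverages. It assumes the relations $r_i = \sqrt{w_i}\,(y_i - \hat{y}_i)$, rss equal to the sum of the $r_i^2$, and df equal to $n$ minus the sum of the $h_{ii}$ (that is, $\mathrm{trace}(I - H)$). The helper name fit_diagnostics is hypothetical and the numbers are made up for illustration; they are not output from fit_spline.

```python
from math import sqrt


def fit_diagnostics(y, yhat, h, wt=None):
    """Recompute res, rss and df from fitted values and leverages."""
    n = len(y)
    # Unit weights are assumed when no weights are supplied.
    w = [1.0] * n if wt is None else list(wt)
    # Weighted residuals r_i = sqrt(w_i) * (y_i - yhat_i).
    res = [sqrt(wi) * (yi - fi) for wi, yi, fi in zip(w, y, yhat)]
    # Weighted residual sum of squares.
    rss = sum(r * r for r in res)
    # trace(I - H) = n - sum of the diagonal elements of H.
    df = n - sum(h)
    return res, rss, df


res, rss, df = fit_diagnostics(
    y=[1.0, 2.0, 3.0],
    yhat=[1.5, 2.0, 2.0],
    h=[0.5, 0.25, 0.25],
    wt=[4.0, 1.0, 1.0],
)
print(res, rss, df)  # [-1.0, 0.0, 1.0] 2.0 2.0
```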

The parameter $\rho$ can be chosen in a number of ways. The fit can be inspected for a number of different values of $\rho$. Alternatively the degrees of freedom for the spline, which determines the value of $\rho$, can be specified, or the (generalized) cross-validation can be minimized to give $\rho$; see fit_spline_parest() for further details.

fit_spline requires the $x_i$ to be strictly increasing. If two or more observations have the same $x_i$-value then they should be replaced by a single observation with $y_i$ equal to the (weighted) mean of the $y$ values and weight, $w_i$, equal to the sum of the weights. This operation can be performed by data_order().
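The tie-merging rule described above can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not a replacement for data_order(): the helper name merge_ties is hypothetical, ties are detected by exact equality of the x-values, and unit weights are assumed when none are given.

```python
def merge_ties(x, y, wt=None):
    """Sort by x and collapse repeated x-values into one observation."""
    n = len(x)
    w = [1.0] * n if wt is None else list(wt)
    order = sorted(range(n), key=lambda i: x[i])
    xs, sums, ws = [], [], []
    for i in order:
        if xs and x[i] == xs[-1]:
            # Same x as the previous observation: accumulate.
            sums[-1] += w[i] * y[i]
            ws[-1] += w[i]
        else:
            xs.append(x[i])
            sums.append(w[i] * y[i])
            ws.append(w[i])
    # Weighted mean of y within each group; the weight is the group sum.
    ys = [s / wsum for s, wsum in zip(sums, ws)]
    return xs, ys, ws


xs, ys, ws = merge_ties([1.0, 1.0, 2.0], [2.0, 4.0, 5.0], [1.0, 3.0, 2.0])
print(xs, ys, ws)  # [1.0, 2.0] [3.5, 5.0] [4.0, 2.0]
```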

The computation is split into three phases.

  1. Compute matrices needed to fit spline.

  2. Fit spline for a given value of $\rho$.

  3. Compute spline coefficients.

When fitting the spline for several different values of $\rho$, phase (1) need only be carried out once and then phase (2) repeated for different values of $\rho$. If the spline is being fitted as part of an iterative weighted least squares procedure, phases (1) and (2) have to be repeated for each set of weights. In either case, phase (3) will often only have to be performed after the final fit has been computed.
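The phase structure above might be driven as in the following minimal sketch. The helper names scan_rho and final_fit are hypothetical. The fitting function is passed in as an argument so that the sketch does not depend on a NAG installation; in real use it would be naginterfaces.library.smooth.fit_spline. The sketch also assumes that the mode letters are 'P', 'Q' and 'F', that an empty dict is acceptable for comm on initial entry, and that a zero-filled c with n-1 rows of 3 values is acceptable when c need not be set.

```python
def scan_rho(fit_spline, x, y, rhos, wt=None):
    """Fit for several smoothing parameters, initializing only once."""
    n = len(x)
    comm = {}                               # need not be set on initial entry
    c = [[0.0] * 3 for _ in range(n - 1)]   # placeholder workspace (assumption)
    summary = []
    mode = "P"                              # first call: phases (1) and (2)
    for rho in rhos:
        yhat, c, rss, df, res, h = fit_spline(mode, x, y, rho, c, comm, wt=wt)
        summary.append((rho, rss, df))
        mode = "Q"                          # later calls: phase (2) only
    return summary


def final_fit(fit_spline, x, y, rho, wt=None):
    """Full fit for the chosen rho, including the spline coefficients."""
    n = len(x)
    comm = {}
    c = [[0.0] * 3 for _ in range(n - 1)]
    # Mode 'F' runs all three phases and returns the coefficients in c.
    return fit_spline("F", x, y, rho, c, comm, wt=wt)
```

In an iterative weighted least squares loop the weights change between calls, so each call there would use the partial-fit mode rather than the quick-fit mode.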

The algorithm is based on Hutchinson (1986).

References

Hastie, T J and Tibshirani, R J, 1990, Generalized Additive Models, Chapman and Hall

Hutchinson, M F, 1986, Algorithm 642: A fast procedure for calculating minimum cross-validation cubic smoothing splines, ACM Trans. Math. Software (12), 150–153

Hutchinson, M F and de Hoog, F R, 1985, Smoothing noisy data with spline functions, Numer. Math. (47), 99–106

Reinsch, C H, 1967, Smoothing by spline functions, Numer. Math. (10), 177–183