naginterfaces.library.smooth.fit_spline_parest
- naginterfaces.library.smooth.fit_spline_parest(method, x, y, crit, wt=None, u=0.0, tol=0.0, maxcal=0)
fit_spline_parest estimates the values of the smoothing parameter and fits a cubic smoothing spline to a set of data.
For full information please refer to the NAG Library document for g10ac:
https://support.nag.com/numeric/nl/nagdoc_30.2/flhtml/g10/g10acf.html
- Parameters
- method : str, length 1
Indicates whether the smoothing parameter is to be found by minimization of the CV or GCV functions, or by finding the smoothing parameter corresponding to a specified degrees of freedom value.
method = 'C': cross-validation is used.
method = 'D': the degrees of freedom are specified.
method = 'G': generalized cross-validation is used.
- x : float, array-like, shape (n)
The distinct and ordered values $x_i$, for $i = 1, 2, \ldots, n$.
- y : float, array-like, shape (n)
The values $y_i$, for $i = 1, 2, \ldots, n$.
- crit : float
If method = 'D', the required degrees of freedom for the spline.
If method = 'C' or 'G', crit need not be set.
- wt : None or float, array-like, shape (n), optional
If wt is not None, wt must contain the $n$ weights. Otherwise wt is not referenced and unit weights are assumed.
- u : float, optional
The upper bound on the smoothing parameter. If u ≤ tol, u = 1000.0 will be used instead. See Further Comments for details on how this argument is used.
- tol : float, optional
The accuracy to which the smoothing parameter rho is required. tol should preferably be not much less than $\sqrt{\epsilon}$, where $\epsilon$ is the machine precision. If tol < $\epsilon$, tol = $\sqrt{\epsilon}$ will be used instead.
- maxcal : int, optional
The maximum number of spline evaluations to be used in finding the value of $\rho$. If maxcal ≤ 0, maxcal = 100 will be used instead.
- Returns
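- yhat : float, ndarray, shape (n)
The fitted values, $\hat{y}_i$, for $i = 1, 2, \ldots, n$.
- c : float, ndarray, shape (n-1, 3)
The spline coefficients. More precisely, the value of the spline approximation at $t$ is given by $((c[i-1,2] \times d + c[i-1,1]) \times d + c[i-1,0]) \times d + \hat{y}_i$, where $x_i \le t < x_{i+1}$ and $d = t - x_i$.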
- rss : float
The (weighted) residual sum of squares.
- df : float
The residual degrees of freedom. If method = 'D' this will be $n - $ crit to the required accuracy.
- res : float, ndarray, shape (n)
The (weighted) residuals, $r_i$, for $i = 1, 2, \ldots, n$.
- h : float, ndarray, shape (n)
The leverages, $h_{ii}$, for $i = 1, 2, \ldots, n$.
- crit : float
If method = 'C', the value of the cross-validation, or if method = 'G', the value of the generalized cross-validation function, evaluated at the value of $\rho$ returned in rho.
- rho : float
The smoothing parameter, $\rho$.
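The description of c above amounts to a Horner-form evaluation of a piecewise cubic. The following is a minimal sketch of that evaluation, not part of the library: the helper name eval_spline is hypothetical, it assumes x is strictly increasing, that c has one row of three coefficients per interval, and that evaluating the last piece at t equal to the final knot is acceptable.

```python
from bisect import bisect_right


def eval_spline(t, x, yhat, c):
    """Evaluate a fitted cubic smoothing spline at t (hypothetical helper).

    x    : strictly increasing knots (the x passed to the fit)
    yhat : fitted values at the knots
    c    : per-interval coefficients, c[k] = (linear, quadratic, cubic)
    """
    n = len(x)
    if not x[0] <= t <= x[-1]:
        raise ValueError("t lies outside the range of the data")
    # Zero-based index k of the interval with x[k] <= t < x[k+1].
    # Clamp so that t == x[-1] is evaluated on the last interval (assumption).
    k = min(bisect_right(x, t) - 1, n - 2)
    d = t - x[k]
    # Horner form: ((c3*d + c2)*d + c1)*d + yhat[k]
    return ((c[k][2] * d + c[k][1]) * d + c[k][0]) * d + yhat[k]
```

With the arrays returned by a fit, a call would look like eval_spline(t, x, yhat, c). Indexing with c[k][j] works both for nested lists and for a two-dimensional ndarray.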
- Raises
- NagValueError
- (errno 1)
On entry, crit = ⟨value⟩.
Constraint: if method = 'D', crit > 2.0.
- (errno 1)
On entry, crit = ⟨value⟩.
Constraint: if method = 'D', crit ≤ $n$.
- (errno 1)
On entry, method is not valid: method = ⟨value⟩.
- (errno 1)
On entry, $n$ = ⟨value⟩.
Constraint: $n \ge 3$.
- (errno 2)
On entry, at least one element of wt ≤ 0.0.
- (errno 3)
On entry, x is not a strictly ordered array.
- (errno 4)
For the specified degrees of freedom, rho > u: u = ⟨value⟩.
- Warns
- NagAlgorithmicWarning
- (errno 5)
Accuracy of tol cannot be achieved: tol = ⟨value⟩.
- (errno 6)
maxcal iterations have been performed.
- (errno 7)
Optimum value of rho lies above u: u = ⟨value⟩.
- Notes
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.
For a set of $n$ observations $(x_i, y_i)$, for $i = 1, 2, \ldots, n$, the spline provides a flexible smooth function for situations in which a simple polynomial or nonlinear regression model is not suitable.
Cubic smoothing splines arise as the unique real-valued solution function $f$, with absolutely continuous first derivative and squared-integrable second derivative, which minimizes
$$\sum_{i=1}^{n} w_i \left( y_i - f(x_i) \right)^2 + \rho \int_{-\infty}^{\infty} \left( f''(x) \right)^2 \, dx,$$
where $w_i$ is the (optional) weight for the $i$th observation and $\rho$ is the smoothing parameter. This criterion consists of two parts: the first measures the fit of the curve and the second the smoothness of the curve. The value of the smoothing parameter $\rho$ weights these two aspects; larger values of $\rho$ give a smoother fitted curve but, in general, a poorer fit. For details of how the cubic spline can be fitted see Hutchinson and de Hoog (1985) and Reinsch (1967).
The fitted values, $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)^T$, and weighted residuals, $r_i$, can be written as:
$$\hat{y} = H y \quad \text{and} \quad r_i = \sqrt{w_i} \left( y_i - \hat{y}_i \right)$$
for a matrix $H$. The residual degrees of freedom for the spline is $\mathrm{trace}(I - H)$ and the diagonal elements of $H$, $h_{ii}$, are the leverages.
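As a rough illustration of how the returned summaries relate to one another, the sketch below recomputes the weighted residuals, the residual sum of squares and the residual degrees of freedom from the fitted values and leverages. It assumes the usual definitions $r_i = \sqrt{w_i}(y_i - \hat{y}_i)$, rss $= \sum_i r_i^2$ and residual degrees of freedom $= \sum_i (1 - h_{ii})$; the function name residual_summaries is hypothetical and is not part of the library.

```python
import math


def residual_summaries(y, yhat, h, wt=None):
    """Recompute (res, rss, df) from a fit (hypothetical helper).

    y, yhat : observed and fitted values
    h       : leverages, the diagonal of the hat matrix H
    wt      : optional positive weights; unit weights if None
    """
    if wt is None:
        wt = [1.0] * len(y)
    # Weighted residuals r_i = sqrt(w_i) * (y_i - yhat_i)
    res = [math.sqrt(w) * (yi - fi) for w, yi, fi in zip(wt, y, yhat)]
    # (Weighted) residual sum of squares
    rss = sum(r * r for r in res)
    # Residual degrees of freedom: trace(I - H) = sum of (1 - h_ii)
    df = sum(1.0 - hii for hii in h)
    return res, rss, df
```

Comparing these values with the res, rss and df returned by the routine is one way to check that the weights and leverages are being interpreted as intended.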
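The parameter $\rho$ can be estimated in a number of ways.
- The degrees of freedom for the spline can be specified, i.e., find $\rho$ such that $\mathrm{trace}(H) = \nu_0$ for given $\nu_0$.
- Minimize the cross-validation (CV), i.e., find $\rho$ such that the CV is minimized, where
$$\mathrm{CV} = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} \left[ \frac{r_i}{1 - h_{ii}} \right]^2.$$
- Minimize the generalized cross-validation (GCV), i.e., find $\rho$ such that the GCV is minimized, where
$$\mathrm{GCV} = \frac{n^2}{\sum_{i=1}^{n} w_i} \left[ \frac{\sum_{i=1}^{n} r_i^2}{\left( \sum_{i=1}^{n} (1 - h_{ii}) \right)^2} \right].$$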
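The two criteria can be recomputed from the weighted residuals and leverages of any fit, which may help when comparing fits at different values of the smoothing parameter. The sketch below assumes the definitions CV $= \sum_i [r_i / (1 - h_{ii})]^2 / \sum_i w_i$ and GCV $= n^2 \sum_i r_i^2 / \left( \sum_i w_i \, (\sum_i (1 - h_{ii}))^2 \right)$, with $r_i$ the weighted residuals; the function names cv_score and gcv_score are hypothetical and are not library routines.

```python
def cv_score(res, h, wt=None):
    """Cross-validation score from weighted residuals and leverages."""
    n = len(res)
    # Sum of weights; with unit weights this is simply n.
    w_sum = float(n) if wt is None else float(sum(wt))
    return sum((r / (1.0 - hii)) ** 2 for r, hii in zip(res, h)) / w_sum


def gcv_score(res, h, wt=None):
    """Generalized cross-validation score from the same inputs."""
    n = len(res)
    w_sum = float(n) if wt is None else float(sum(wt))
    rss = sum(r * r for r in res)              # weighted residual sum of squares
    resid_df = sum(1.0 - hii for hii in h)     # trace(I - H)
    return (n ** 2 / w_sum) * rss / resid_df ** 2
```

When every leverage is equal the two scores coincide, since GCV replaces each $1 - h_{ii}$ by its average; they differ when a few observations have much higher leverage than the rest.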
fit_spline_parest requires the $x_i$ to be strictly increasing. If two or more observations have the same $x_i$ value then they should be replaced by a single observation with $y_i$ equal to the (weighted) mean of the $y$ values and weight, $w_i$, equal to the sum of the weights. This operation can be performed by data_order().
The algorithm is based on Hutchinson (1986). roots.contfn_brent_rcomm is used to solve for $\rho$ given $\nu_0$ and the method of opt.one_var_func is used to minimize the GCV or CV.
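The rule for tied observations described above can be sketched in plain Python as follows. This is an illustrative stand-in, not the implementation or the interface of data_order(): the function name merge_ties is hypothetical, and ties are detected by exact floating-point equality of the x values, which is an assumption.

```python
def merge_ties(x, y, wt=None):
    """Collapse observations sharing an x value (hypothetical helper).

    Returns (xs, ys, ws) with xs strictly increasing, ys the weighted
    mean of y within each tie group, and ws the sum of the weights.
    """
    if wt is None:
        wt = [1.0] * len(x)
    groups = {}
    for xi, yi, wi in zip(x, y, wt):
        # Accumulate the sum of weights and the weighted sum of y per x value.
        sw, swy = groups.get(xi, (0.0, 0.0))
        groups[xi] = (sw + wi, swy + wi * yi)
    xs = sorted(groups)
    ws = [groups[xi][0] for xi in xs]
    ys = [groups[xi][1] / groups[xi][0] for xi in xs]
    return xs, ys, ws
```

The merged arrays can then be passed as x, y and wt; because the merged weights are generally no longer all equal to one, wt should be supplied even if the original data were unweighted.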
- References
Hastie, T J and Tibshirani, R J, 1990, Generalized Additive Models, Chapman and Hall
Hutchinson, M F, 1986, Algorithm 642: A fast procedure for calculating minimum cross-validation cubic smoothing splines, ACM Trans. Math. Software (12), 150–153
Hutchinson, M F and de Hoog, F R, 1985, Smoothing noisy data with spline functions, Numer. Math. (47), 99–106
Reinsch, C H, 1967, Smoothing by spline functions, Numer. Math. (10), 177–183