NAG CL Interface
g10abc (fit_spline)
1
Purpose
g10abc fits a cubic smoothing spline for a given smoothing parameter.
2
Specification
void g10abc (Nag_SmoothFitType mode,
             Integer n,
             const double x[],
             const double y[],
             const double wt[],
             double rho,
             double yhat[],
             double c[],
             Integer pdc,
             double *rss,
             double *df,
             double res[],
             double h[],
             double comm[],
             NagError *fail)
The function may be called by the names: g10abc, nag_smooth_fit_spline or nag_smooth_spline_fit.
3
Description
g10abc fits a cubic smoothing spline to a set of $n$ observations $(x_i, y_i)$, for $i = 1, 2, \ldots, n$. The spline provides a flexible smooth function for situations in which a simple polynomial or nonlinear regression model is unsuitable.
Cubic smoothing splines arise as the unique real-valued solution function $f$, with absolutely continuous first derivative and squared-integrable second derivative, which minimizes:
$$\sum_{i=1}^{n} w_i \{y_i - f(x_i)\}^2 + \rho \int_{-\infty}^{\infty} \{f''(x)\}^2 \, dx,$$
where $w_i$ is the (optional) weight for the $i$th observation and $\rho$ is the smoothing parameter. This criterion consists of two parts: the first measures the fit of the curve, and the second the smoothness of the curve. The value of the smoothing parameter $\rho$ weights these two aspects; larger values of $\rho$ give a smoother fitted curve but, in general, a poorer fit. For details of how the cubic spline can be estimated see Hutchinson and de Hoog (1985) and Reinsch (1967).
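The fitted values, $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)^T$, and weighted residuals, $r_i$, can be written as
$$\hat{y} = Hy \quad \text{and} \quad r_i = \sqrt{w_i}\,(y_i - \hat{y}_i)$$
for a matrix $H$. The residual degrees of freedom for the spline is $\mathrm{trace}(I - H)$ and the diagonal elements of $H$, $h_{ii}$, are the leverages.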
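The summary quantities described above can be recomputed from the fitted values and leverages. The following is a minimal sketch in plain C, not part of the NAG Library; the helper names weighted_rss and residual_df are hypothetical. It assumes that the weighted residual sum of squares is the sum of the squared weighted residuals, $\sum_i w_i (y_i - \hat{y}_i)^2$, and it uses the identity $\mathrm{trace}(I - H) = n - \sum_i h_{ii}$.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted residual sum of squares: sum of w_i * (y_i - yhat_i)^2.
   wt may be NULL, in which case unit weights are assumed. */
double weighted_rss(size_t n, const double y[], const double yhat[],
                    const double wt[])
{
    double rss = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double w = wt ? wt[i] : 1.0;
        double e = y[i] - yhat[i];
        assert(w > 0.0);            /* weights must be strictly positive */
        rss += w * e * e;
    }
    return rss;
}

/* Residual degrees of freedom: trace(I - H) = n - sum of the leverages. */
double residual_df(size_t n, const double h[])
{
    double trace_h = 0.0;
    for (size_t i = 0; i < n; ++i)
        trace_h += h[i];
    return (double)n - trace_h;
}
```

Applied to the y, yhat, wt and h arrays of a fit, these values would be expected to agree with the rss and df outputs up to rounding error.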
The parameter $\rho$ can be chosen in a number of ways. The fit can be inspected for a number of different values of $\rho$. Alternatively the degrees of freedom for the spline, which determines the value of $\rho$, can be specified, or the (generalized) cross-validation can be minimized to give $\rho$; see g10acc for further details.
g10abc requires the $x_i$ to be strictly increasing. If two or more observations have the same $x_i$-value then they should be replaced by a single observation with $y_i$ equal to the (weighted) mean of the $y$ values and weight, $w_i$, equal to the sum of the weights. This operation can be performed by g10zac.
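The merging rule for tied observations can be sketched as follows. This is an illustrative plain C routine, not the implementation of g10zac; the name merge_ties is hypothetical, and the sketch assumes the data are already sorted by x and that all weights are strictly positive. Each run of equal x values is collapsed in place to one observation whose y is the weighted mean and whose weight is the sum of the weights.

```c
#include <assert.h>
#include <stddef.h>

/* Collapse runs of equal x (input sorted in non-decreasing order) in place.
   Returns the number of distinct x values left at the front of the arrays. */
size_t merge_ties(size_t n, double x[], double y[], double wt[])
{
    assert(n > 0 && wt[0] > 0.0);
    size_t m = 0;                  /* index of the current merged observation */
    double wy = wt[0] * y[0];      /* running sum of w * y for the current run */

    for (size_t i = 1; i < n; ++i) {
        assert(x[i] >= x[i - 1]);  /* input must already be sorted */
        assert(wt[i] > 0.0);
        if (x[i] == x[m]) {
            /* Same x as the current run: accumulate weight and w * y. */
            wt[m] += wt[i];
            wy += wt[i] * y[i];
        } else {
            /* Close the current run and start a new one. */
            y[m] = wy / wt[m];
            ++m;
            x[m] = x[i];
            wt[m] = wt[i];
            wy = wt[i] * y[i];
        }
    }
    y[m] = wy / wt[m];             /* close the final run */
    return m + 1;
}
```

The returned count would then be passed as n, with the leading elements of x, y and wt as the strictly increasing data.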
The computation is split into three phases.
-
(i) Compute matrices needed to fit spline.
-
(ii) Fit spline for a given value of $\rho$.
-
(iii) Compute spline coefficients.
When fitting the spline for several different values of $\rho$, phase (i) need only be carried out once and then phase (ii) repeated for different values of $\rho$. If the spline is being fitted as part of an iterative weighted least squares procedure, phases (i) and (ii) have to be repeated for each set of weights. In either case, phase (iii) will often only have to be performed after the final fit has been computed.
The algorithm is based on Hutchinson (1986).
4
References
Hastie T J and Tibshirani R J (1990) Generalized Additive Models Chapman and Hall
Hutchinson M F (1986) Algorithm 642: A fast procedure for calculating minimum cross-validation cubic smoothing splines ACM Trans. Math. Software 12 150–153
Hutchinson M F and de Hoog F R (1985) Smoothing noisy data with spline functions Numer. Math. 47 99–106
Reinsch C H (1967) Smoothing by spline functions Numer. Math. 10 177–183
5
Arguments
-
1: mode – Nag_SmoothFitType
Input
-
On entry: indicates in which mode the function is to be used.
- mode = Nag_SmoothFitPartial: Initialization and fitting are performed. This partial fit can be used in an iterative weighted least squares context where the weights are changing at each call to g10abc or when the coefficients are not required.
- mode = Nag_SmoothFitQuick: Fitting only is performed. Initialization must have been performed previously by a call to g10abc with mode = Nag_SmoothFitPartial. This quick fit may be called repeatedly with different values of rho without re-initialization.
- mode = Nag_SmoothFitFull: Initialization and full fitting are performed and the function coefficients are calculated.
Constraint: mode = Nag_SmoothFitPartial, Nag_SmoothFitQuick or Nag_SmoothFitFull.
-
2: n – Integer
Input
-
On entry: $n$, the number of distinct observations.
Constraint: n ≥ 3.
-
3: x[n] – const double
Input
-
On entry: the distinct and ordered values $x_i$, for $i = 1, 2, \ldots, n$.
Constraint: x[i-1] < x[i], for $i = 1, 2, \ldots, n-1$.
-
4: y[n] – const double
Input
-
On entry: the values $y_i$, for $i = 1, 2, \ldots, n$.
-
5: wt[dim] – const double
Input
-
Note: the dimension, dim, of the array wt must be at least $n$.
On entry: optionally, the $n$ weights.
If weights are not provided then wt must be set to NULL, in which case unit weights are assumed.
Constraint: if wt is not NULL, wt[i-1] > 0.0, for $i = 1, 2, \ldots, n$.
-
6: rho – double
Input
-
On entry: $\rho$, the smoothing parameter.
Constraint: rho ≥ 0.0.
-
7: yhat[n] – double
Output
-
On exit: the fitted values, $\hat{y}_i$, for $i = 1, 2, \ldots, n$.
-
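8: c[] – double
Input/Output
-
Note: the $(i, j)$th element of the matrix $C$ is stored in c[(j-1)*pdc + i-1].
On entry: if mode = Nag_SmoothFitQuick, c must be unaltered from the previous call to g10abc with mode = Nag_SmoothFitPartial. Otherwise c need not be set.
On exit: if mode = Nag_SmoothFitFull, c contains the spline coefficients. More precisely, the value of the spline at $t$ is given by $((C_{i3} d + C_{i2}) d + C_{i1}) d + \hat{y}_i$, where $x_i \le t < x_{i+1}$ and $d = t - x_i$.
If mode = Nag_SmoothFitPartial or Nag_SmoothFitQuick, c contains information that will be used in a subsequent call to g10abc with mode = Nag_SmoothFitQuick.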
-
9: pdc – Integer
Input
-
On entry: the stride separating matrix row elements in the array c.
Constraint: pdc ≥ n − 1.
-
10: rss – double *
Output
-
On exit: the (weighted) residual sum of squares.
-
11: df – double *
Output
-
On exit: the residual degrees of freedom.
-
12: res[n] – double
Output
-
On exit: the (weighted) residuals, $r_i$, for $i = 1, 2, \ldots, n$.
-
13: h[n] – double
Output
-
On exit: the leverages, $h_{ii}$, for $i = 1, 2, \ldots, n$.
-
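14: comm[] – double
Communication Array
-
On entry: if mode = Nag_SmoothFitQuick, comm must be unaltered from the previous call to g10abc with mode = Nag_SmoothFitPartial. Otherwise comm need not be set.
On exit: if mode = Nag_SmoothFitPartial or Nag_SmoothFitQuick, comm contains information that will be used in a subsequent call to g10abc with mode = Nag_SmoothFitQuick.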
-
15: fail – NagError *
Input/Output
-
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
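The piecewise cubic described under the argument c can be evaluated with a short helper. The sketch below is plain C and not part of the NAG Library; the name spline_eval is hypothetical and size_t stands in for the NAG Integer type. It assumes that element $(i, j)$ of the coefficient matrix is stored at c[(j-1)*pdc + i-1], as the stride description of pdc suggests, and that the spline on $x_i \le t < x_{i+1}$ is $\hat{y}_i + C_{i1} d + C_{i2} d^2 + C_{i3} d^3$ with $d = t - x_i$. Evaluation is restricted to the data range.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate the fitted spline at t, where x[0] <= t <= x[n-1].
   Assumed layout: C(i,j) at c[(j-1)*pdc + (i-1)], i = 1..n-1, j = 1..3. */
double spline_eval(size_t n, const double x[], const double yhat[],
                   const double c[], size_t pdc, double t)
{
    assert(n >= 2 && pdc >= n - 1);
    assert(t >= x[0] && t <= x[n - 1]);

    /* Binary search for the 0-based interval lo with x[lo] <= t < x[lo+1];
       t equal to x[n-1] is assigned to the last interval. */
    size_t lo = 0, hi = n - 1;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (t >= x[mid])
            lo = mid;
        else
            hi = mid;
    }

    double d = t - x[lo];
    /* Horner form of yhat + c1*d + c2*d^2 + c3*d^3 */
    return ((c[2 * pdc + lo] * d + c[pdc + lo]) * d + c[lo]) * d + yhat[lo];
}
```

At a data point $t = x_i$ the offset $d$ is zero, so the helper returns the fitted value $\hat{y}_i$, which gives a quick consistency check on the assumed layout.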
6
Error Indicators and Warnings
- NE_ALLOC_FAIL
-
Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
- NE_ARRAY_SIZE
-
On entry, pdc = ⟨value⟩ and n = ⟨value⟩.
Constraint: pdc ≥ n − 1.
- NE_BAD_PARAM
-
On entry, argument ⟨value⟩ had an illegal value.
- NE_INT
-
On entry, n = ⟨value⟩.
Constraint: n ≥ 3.
- NE_INTERNAL_ERROR
-
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
- NE_NO_LICENCE
-
Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
- NE_NOT_STRICTLY_INCREASING
-
On entry, x is not a strictly ordered array.
- NE_REAL
-
On entry, rho = ⟨value⟩.
Constraint: rho ≥ 0.0.
- NE_WEIGHTS_NOT_POSITIVE
-
On entry, at least one element of wt ≤ 0.0.
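Several of the conditions above can be checked by the caller before the fit is attempted. The following plain C sketch is illustrative only and is not how the library performs its checks; the function check_inputs and the check_result enumeration are hypothetical names, long stands in for the NAG Integer type, and the lower bound of three observations is an assumption about the constraint on n.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes, loosely mirroring the error exits above. */
typedef enum {
    CHECK_OK = 0,
    CHECK_TOO_FEW_POINTS,       /* cf. NE_INT */
    CHECK_X_NOT_INCREASING,     /* cf. NE_NOT_STRICTLY_INCREASING */
    CHECK_RHO_NEGATIVE,         /* cf. NE_REAL */
    CHECK_WEIGHT_NOT_POSITIVE   /* cf. NE_WEIGHTS_NOT_POSITIVE */
} check_result;

/* Validate n, x, wt and rho; wt may be NULL (unit weights). */
check_result check_inputs(long n, const double x[], const double wt[],
                          double rho)
{
    if (n < 3)                  /* assumed minimum number of observations */
        return CHECK_TOO_FEW_POINTS;
    assert(x != NULL);

    for (long i = 1; i < n; ++i)
        if (!(x[i - 1] < x[i]))
            return CHECK_X_NOT_INCREASING;

    if (rho < 0.0)
        return CHECK_RHO_NEGATIVE;

    if (wt != NULL)
        for (long i = 0; i < n; ++i)
            if (!(wt[i] > 0.0))
                return CHECK_WEIGHT_NOT_POSITIVE;

    return CHECK_OK;
}
```

The comparisons are written in negated form so that NaN values in x or wt are rejected rather than passed through.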
7
Accuracy
Accuracy depends on the value of $\rho$ and the position of the $x$ values. The values of $x_i - x_{i-1}$ and $w_i$ are scaled and $\rho$ is transformed to avoid underflow and overflow problems.
8
Parallelism and Performance
g10abc is not threaded in any implementation.
9
Further Comments
The time taken by g10abc is of order $n$.
Regression splines with a small ($< n$) number of knots can be fitted by e02bac and e02bec.
10
Example
The data, given by Hastie and Tibshirani (1990), is the age, $x_i$, and C-peptide concentration (pmol/ml), $y_i$, from a study of the factors affecting insulin-dependent diabetes mellitus in children. The data is input, reduced to a strictly ordered set by g10zac and a series of splines fit using a range of values for the smoothing parameter, $\rho$.