plane. The knots of the spline are located automatically, but a single argument must be specified to control the trade-off between closeness of fit and smoothness of fit.

2 Specification

#include <nag.h>

void	e02dcc (Nag_Start start, Integer mx, const double x[], Integer my, const double y[], const double f[], double s, Integer nxest, Integer nyest, double *fp, Nag_Comm *warmstartinf, Nag_2dSpline *spline, NagError *fail)

The function may be called by the names: e02dcc, nag_fit_dim2_spline_grid or nag_2d_spline_fit_grid.

3 Description

e02dcc determines a smooth bicubic spline approximation $s(x, y)$ to the set of data points $(x_{q}, y_{r}, f_{q, r})$, for $q = 1, 2, \dots, m_{x}$ and $r = 1, 2, \dots, m_{y}$.

The spline is given in the B-spline representation

$$s(x, y) = \sum_{i = 1}^{n_{x} - 4} \sum_{j = 1}^{n_{y} - 4} c_{i j} M_{i}(x) N_{j}(y), \qquad (1)$$

where $M_{i}(x)$ and $N_{j}(y)$ denote normalized cubic B-splines, the former defined on the knots $λ_{i}$ to $λ_{i + 4}$ and the latter on the knots $μ_{j}$ to $μ_{j + 4}$. For further details, see Hayes and Halliday (1974) for bicubic splines and de Boor (1972) for normalized B-splines.
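The representation (1) can be evaluated directly from a knot set and a coefficient array. The following is a minimal sketch, assuming the Cox-de Boor recursion for the normalized B-splines and the row-major coefficient layout $c[(n_{y} - 4) \times (i - 1) + j - 1]$ described in Section 5. The names bspline_basis and bicubic_eval are hypothetical helpers, not NAG functions; in practice e02dec or e02dfc would be used.

```c
/* Illustrative helper (not a NAG routine): Cox-de Boor recursion for the
   normalized B-spline of order k (degree k - 1) whose support starts at
   knot t[i] (0-based). n is the total number of knots in t. */
static double bspline_basis(const double *t, int n, int i, int k, double x)
{
    if (k == 1) {
        if (t[i] <= x && x < t[i + 1])
            return 1.0;
        /* Close the last non-empty interval so x == t[n - 1] is covered. */
        if (x == t[n - 1] && t[i] < t[i + 1] && t[i + 1] == t[n - 1])
            return 1.0;
        return 0.0;
    }
    double value = 0.0;
    double d1 = t[i + k - 1] - t[i];
    double d2 = t[i + k] - t[i + 1];
    /* Terms over a zero-width knot span are taken as zero. */
    if (d1 > 0.0)
        value += (x - t[i]) / d1 * bspline_basis(t, n, i, k - 1, x);
    if (d2 > 0.0)
        value += (t[i + k] - x) / d2 * bspline_basis(t, n, i + 1, k - 1, x);
    return value;
}

/* Evaluate s(x, y) as in (1): nx, ny are the knot counts, lamda and mu the
   knot arrays, and c holds c_ij at c[(ny - 4) * (i - 1) + (j - 1)]. */
double bicubic_eval(int nx, const double *lamda, int ny, const double *mu,
                    const double *c, double x, double y)
{
    double s = 0.0;
    for (int i = 0; i < nx - 4; ++i) {
        double mi = bspline_basis(lamda, nx, i, 4, x);
        if (mi == 0.0)
            continue; /* x lies outside the support of M_i */
        for (int j = 0; j < ny - 4; ++j)
            s += c[(ny - 4) * i + j] * mi * bspline_basis(mu, ny, j, 4, y);
    }
    return s;
}
```

With no interior knots ($n_{x} = n_{y} = 8$) the sixteen basis products reduce to a bicubic polynomial basis on the data rectangle, so setting every coefficient to 1 gives $s(x, y) = 1$ everywhere in the domain.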

The total numbers $n_{x}$ and $n_{y}$ of these knots and their values $λ_{1}, \dots, λ_{n_{x}}$ and $μ_{1}, \dots, μ_{n_{y}}$ are chosen automatically by the function. The knots $λ_{5}, \dots, λ_{n_{x} - 4}$ and $μ_{5}, \dots, μ_{n_{y} - 4}$ are the interior knots; they divide the approximation domain $[x_{1}, x_{m_{x}}] \times [y_{1}, y_{m_{y}}]$ into $(n_{x} - 7) \times (n_{y} - 7)$ subpanels $[λ_{i}, λ_{i + 1}] \times [μ_{j}, μ_{j + 1}]$, for $i = 4, 5, \dots, n_{x} - 4$ and $j = 4, 5, \dots, n_{y} - 4$. Then, much as in the curve case (see e02bec), the coefficients $c_{i j}$ are determined as the solution of the following constrained minimization problem:

$$\text{minimize } η, \qquad (2)$$

subject to the constraint

$$θ = \sum_{q = 1}^{m_{x}} \sum_{r = 1}^{m_{y}} ε_{q, r}^{2} \leq S, \qquad (3)$$

where $η$ is a measure of the (lack of) smoothness of $s(x, y)$. Its value depends on the discontinuity jumps in $s(x, y)$ across the boundaries of the subpanels. It is zero only when there are no discontinuities and is positive otherwise, increasing with the size of the jumps (see Dierckx (1982) for details). $ε_{q, r}$ denotes the residual $f_{q, r} - s(x_{q}, y_{r})$, and $S$ is a non-negative number to be specified.
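The quantity $θ$ in (3) is straightforward to compute once the spline has been evaluated at the grid points. The sketch below assumes the fitted values are held in an array sfit with the same layout as the data array f (see Section 5); sum_sq_residuals and the array name sfit are hypothetical and not part of the NAG interface.

```c
/* Sum of squared residuals, theta in (3), for data f and fitted values sfit.
   Both arrays hold the value for grid point (q, r) at index my*(q-1) + (r-1). */
double sum_sq_residuals(int mx, int my, const double *f, const double *sfit)
{
    double theta = 0.0;
    for (int q = 0; q < mx; ++q) {
        for (int r = 0; r < my; ++r) {
            double eps = f[my * q + r] - sfit[my * q + r]; /* residual */
            theta += eps * eps;
        }
    }
    return theta;
}
```

A fit satisfies the constraint (3) when the returned value does not exceed the chosen smoothing factor.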

By means of the argument $S$, 'the smoothing factor', you control the balance between smoothness and closeness of fit, as measured by the sum of squares of residuals in (3). If $S$ is too large, the spline will be too smooth and signal will be lost (underfit); if $S$ is too small, the spline will pick up too much noise (overfit). In the extreme cases the function will return an interpolating spline $(θ = 0)$ if $S$ is set to zero, and the least squares bicubic polynomial ($η = 0$) if $S$ is set very large. Experimenting with $S$ values between these two extremes should result in a good compromise. (See Section 9.3 for advice on choice of $S$.)

The method employed is outlined in Section 9.5 and fully described in Dierckx (1981) and Dierckx (1982). It involves an adaptive strategy for locating the knots of the bicubic spline (depending on the function underlying the data and on the value of $S$), and an iterative method for solving the constrained minimization problem once the knots have been determined.

Values and derivatives of the computed spline can subsequently be computed by calling e02dec, e02dfc and e02dhc as described in Section 9.6.

4 References

de Boor C (1972) On calculating with B-splines J. Approx. Theory 6 50–62

Dierckx P (1981) An improved algorithm for curve fitting with spline functions Report TW54 Department of Computer Science, Katholieke Universiteit Leuven

Dierckx P (1982) A fast algorithm for smoothing data on a rectangular grid while using spline functions SIAM J. Numer. Anal. 19 1286–1304

Hayes J G and Halliday J (1974) The least squares fitting of cubic spline surfaces to general data sets J. Inst. Math. Appl. 14 89–103

Reinsch C H (1967) Smoothing by spline functions Numer. Math. 10 177–183

5 Arguments

1: $start$ – Nag_Start Input

On entry: start must be set to Nag_Cold or Nag_Warm.

$start = Nag_Cold$ (cold start): The function will build up the knot set starting with no interior knots. No values need be assigned to $spline \to nx$ and $spline \to ny$, and memory will be internally allocated to $spline \to lamda$, $spline \to mu$, $spline \to c$, $warmstartinf \to nag_w$ and $warmstartinf \to nag_iw$.
$start = Nag_Warm$ (warm start): The function will restart the knot-placing strategy using the knots found in a previous call of the function. In this case, all arguments except s must be unchanged from that previous call. This warm start can save much time in searching for a satisfactory value of $S$.

Constraint: $start = Nag_Cold$ or $Nag_Warm$.

2: $mx$ – Integer Input

On entry: $m_{x}$, the number of grid points along the $x$ axis.

Constraint: $mx \geq 4$.

3: $x [mx]$ – const double Input

On entry: $x[q - 1]$ must be set to $x_{q}$, the $x$ coordinate of the $q$th grid point along the $x$ axis, for $q = 1, 2, \dots, m_{x}$.

Constraint: $x_{1} < x_{2} < \dots < x_{m_{x}}$.

4: $my$ – Integer Input

On entry: $m_{y}$, the number of grid points along the $y$ axis.

Constraint: $my \geq 4$.

5: $y [my]$ – const double Input

On entry: $y[r - 1]$ must be set to $y_{r}$, the $y$ coordinate of the $r$th grid point along the $y$ axis, for $r = 1, 2, \dots, m_{y}$.

Constraint: $y_{1} < y_{2} < \dots < y_{m_{y}}$.

6: $f [mx \times my]$ – const double Input

On entry: $f[m_{y} \times (q - 1) + r - 1]$ must contain the data value $f_{q, r}$, for $q = 1, 2, \dots, m_{x}$ and $r = 1, 2, \dots, m_{y}$.
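The storage rule for f means that the $y$ index varies fastest. As an illustration only, the following sketch fills f from a function sampled on the grid; f_index, fill_grid_values and sample_surface are hypothetical names introduced here, and the sample surface is arbitrary.

```c
#include <stddef.h>

/* Position in f of the data value f_{q,r}, with q and r 1-based as in the text. */
size_t f_index(int my, int q, int r)
{
    return (size_t)my * (size_t)(q - 1) + (size_t)(r - 1);
}

/* Arbitrary stand-in for the quantity being sampled on the grid. */
double sample_surface(double x, double y)
{
    return x + 10.0 * y;
}

/* Fill f (length mx*my) with g evaluated at every grid point (x_q, y_r). */
void fill_grid_values(int mx, const double *x, int my, const double *y,
                      double (*g)(double, double), double *f)
{
    for (int q = 1; q <= mx; ++q)
        for (int r = 1; r <= my; ++r)
            f[f_index(my, q, r)] = g(x[q - 1], y[r - 1]);
}
```

For a grid with $m_{x} = 2$ and $m_{y} = 3$, the value $f_{2, 3}$ is stored in f[5], the last element of the array.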

7: $s$ – double Input

On entry: the smoothing factor, $S$.

If $S = 0.0$, the function returns an interpolating spline.

If $S$ is smaller than machine precision, it is assumed equal to zero.

For advice on the choice of $S$, see Section 3 and Section 9.3.

Constraint: $s \geq 0.0$.

8: $nxest$ – Integer Input

9: $nyest$ – Integer Input

On entry: an upper bound for the number of knots $n_{x}$ and $n_{y}$ required in the $x$ and $y$ directions respectively.

In most practical situations, $nxest = m_{x} / 2$ and $nyest = m_{y} / 2$ is sufficient. nxest and nyest never need to be larger than $m_{x} + 4$ and $m_{y} + 4$ respectively, the numbers of knots needed for interpolation ($S = 0.0$). See also Section 9.4.

Constraint: $nxest \geq 8$ and $nyest \geq 8$.
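The guidance above can be combined into a small helper for picking a starting value. This is a sketch under stated assumptions: suggest_knot_bound is a hypothetical name, integer division is used for $m / 2$, and the result is raised to 8 where needed so that the constraint is met.

```c
/* Suggest a value for nxest (or nyest) given the number of grid points m
   in that direction and the smoothing factor s. */
int suggest_knot_bound(int m, double s)
{
    if (s == 0.0)
        return m + 4;      /* interpolation needs m + 4 knots */
    int est = m / 2;       /* usually sufficient, per the text above */
    if (est < 8)
        est = 8;           /* constraint: the bound must be at least 8 */
    return est;
}
```

If the function then returns NE_NUM_KNOTS_2D_GT_RECT, the bound can be increased towards $m + 4$ or $S$ can be made larger.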

10: $fp$ – double * Output

On exit: the sum of squared residuals, $θ$, of the computed spline approximation. If $fp = 0.0$, this is an interpolating spline. fp should equal $S$ within a relative tolerance of $0.001$ unless $spline \to nx = spline \to ny = 8$, when the spline has no interior knots and so is simply a bicubic polynomial. For knots to be inserted, $S$ must be set to a value below the value of fp produced in this case.

11: $warmstartinf$ – Nag_Comm *

Pointer to structure of type Nag_Comm with the following members:

nag_w – double * Input: On entry: if the warm start option is used, the values $nag_w [0], \dots, nag_w [3]$ must be left unchanged from the previous call.

nag_iw – Integer * Input: On entry: if the warm start option is used, the values $nag_iw [0], \dots, nag_iw [2]$ must be left unchanged from the previous call.

Note that when the information contained in the pointers nag_w and nag_iw is no longer of use, or before a new call to e02dcc with the same warmstartinf, you should free this storage using the NAG macro NAG_FREE. This storage will have been allocated only if this function returns with $fail.code = NE_NOERROR$, NE_SPLINE_COEFF_CONV, or NE_NUM_KNOTS_2D_GT_RECT.

12: $spline$ – Nag_2dSpline *

Pointer to structure of type Nag_2dSpline with the following members:

nx – Integer Input/Output: On entry: if the warm start option is used, the value of $nx$ must be left unchanged from the previous call.

On exit: the total number of knots, $n_{x}$, of the computed spline with respect to the $x$ variable.

lamda – double * Input/Output: On entry: a pointer to which, if $start = Nag_Cold$, memory of size nxest is internally allocated. If the warm start option is used, the values $lamda [0], lamda [1], \dots, lamda [nx - 1]$ must be left unchanged from the previous call.

On exit: $lamda$ contains the complete set of knots $λ_{i}$ associated with the $x$ variable, i.e., the interior knots $lamda [4], lamda [5], \dots, lamda [nx - 5]$ as well as the additional knots $lamda [0] = lamda [1] = lamda [2] = lamda [3] = x [0]$ and $lamda [nx - 4] = lamda [nx - 3] = lamda [nx - 2] = lamda [nx - 1] = x [mx - 1]$ needed for the B-spline representation.

ny – Integer Input/Output: On entry: if the warm start option is used, the value of $ny$ must be left unchanged from the previous call.

On exit: the total number of knots, $n_{y}$, of the computed spline with respect to the $y$ variable.

mu – double * Input/Output: On entry: a pointer to which, if $start = Nag_Cold$, memory of size nyest is internally allocated. If the warm start option is used, the values $mu [0], mu [1], \dots, mu [ny - 1]$ must be left unchanged from the previous call.

On exit: $mu$ contains the complete set of knots $μ_{j}$ associated with the $y$ variable, i.e., the interior knots $mu [4], mu [5], \dots, mu [ny - 5]$ as well as the additional knots $mu [0] = mu [1] = mu [2] = mu [3] = y [0]$ and $mu [ny - 4] = mu [ny - 3] = mu [ny - 2] = mu [ny - 1] = y [my - 1]$ needed for the B-spline representation.

c – double * Output: On exit: a pointer to which, if $start = Nag_Cold$, memory of size $(nxest - 4) \times (nyest - 4)$ is internally allocated. $c [(n_{y} - 4) \times (i - 1) + j - 1]$ is the coefficient $c_{i j}$ defined in Section 3.

Note that when the information contained in the pointers lamda, mu and c is no longer of use, or before a new call to e02dcc with the same spline, you should free this storage using the NAG macro NAG_FREE. This storage will have been allocated only if this function returns with $fail.code = NE_NOERROR$, NE_SPLINE_COEFF_CONV, or NE_NUM_KNOTS_2D_GT_RECT.

13: $fail$ – NagError * Input/Output

The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).

6 Error Indicators and Warnings

NE_ALLOC_FAIL: Dynamic memory allocation failed.
NE_BAD_PARAM: On entry, argument start had an illegal value.
NE_ENUMTYPE_WARM: $start = Nag_Warm$ at the first call of this function. start must be set to $start = Nag_Cold$ at the first call.
NE_INT_ARG_LT: On entry, $mx = ⟨ value ⟩$ .
Constraint: $mx \geq 4$ .

On entry, $my = ⟨ value ⟩$ .
Constraint: $my \geq 4$ .

On entry, $nxest = ⟨ value ⟩$ .
Constraint: $nxest \geq 8$ .

On entry, $nyest = ⟨ value ⟩$ .
Constraint: $nyest \geq 8$ .
NE_NOT_STRICTLY_INCREASING: The sequence x is not strictly increasing: $x [⟨ value ⟩] = ⟨ value ⟩$ , $x [⟨ value ⟩] = ⟨ value ⟩$ .
The sequence y is not strictly increasing: $y [⟨ value ⟩] = ⟨ value ⟩$ , $y [⟨ value ⟩] = ⟨ value ⟩$ .
NE_NUM_KNOTS_2D_GT_RECT: The number of knots required is greater than allowed by nxest or nyest, $nxest = ⟨ value ⟩$ , $nyest = ⟨ value ⟩$ . Possibly s is too small, especially if nxest, $nyest > mx / 2$ , $my / 2$ . $s = ⟨ value ⟩$ , $mx = ⟨ value ⟩$ , $my = ⟨ value ⟩$ . A spline approximation is returned, but it fails to satisfy the fitting criterion (see (2) and (3)) – perhaps by only a small amount, however.
NE_REAL_ARG_LT: On entry, s must not be less than 0.0: $s = ⟨ value ⟩$ .
NE_SF_D_K_CONS: On entry, $s = ⟨ value ⟩$ , $nxest = ⟨ value ⟩$ , $mx = ⟨ value ⟩$ .
Constraint: $nxest \geq mx + 4$ when $s = 0.0$ .

On entry, $s = ⟨ value ⟩$ , $nyest = ⟨ value ⟩$ , $my = ⟨ value ⟩$ .
Constraint: $nyest \geq my + 4$ when $s = 0.0$.
NE_SPLINE_COEFF_CONV: The iterative process has failed to converge. Possibly s is too small: $s = ⟨ value ⟩$ . A spline approximation is returned, but it fails to satisfy the fitting criterion (see (2) and (3)) – perhaps by only a small amount, however.

7 Accuracy

On successful exit, the approximation returned is such that its sum of squared residuals fp is equal to the smoothing factor $S$, up to a specified relative tolerance of $0.001$, except that if $n_{x} = 8$ and $n_{y} = 8$, fp may be significantly less than $S$: in this case the computed spline is simply the least squares bicubic polynomial approximation of degree $3$, i.e., a spline with no interior knots.
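The accuracy statement can be turned into a simple post-call check. The following is a minimal sketch; fp_consistent_with_s is a hypothetical helper, and it assumes the values of fp, s, nx and ny have been copied out of the call to e02dcc and the returned spline structure.

```c
/* Returns 1 if fp agrees with s in the sense of Section 7, 0 otherwise.
   With no interior knots (nx == ny == 8) fp only has to be <= s. */
int fp_consistent_with_s(double fp, double s, int nx, int ny)
{
    if (nx == 8 && ny == 8)
        return fp <= s;
    double diff = fp > s ? fp - s : s - fp;
    return diff <= 0.001 * s; /* relative tolerance quoted in the text */
}
```

A result of 0 after a return with NE_NOERROR would be unexpected; after NE_SPLINE_COEFF_CONV or NE_NUM_KNOTS_2D_GT_RECT it indicates by how much the fitting criterion was missed.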

8 Parallelism and Performance

Background information to multithreading can be found in the Multithreading documentation.

e02dcc is not threaded in any implementation.

9 Further Comments

9.1 Timing

The time taken for a call of e02dcc depends on the complexity of the shape of the data, the value of the smoothing factor $S$, and the number of data points. If e02dcc is to be called for different values of $S$, much time can be saved by setting $start = Nag_Warm$ after the first call.

9.2 Weighting of Data Points

e02dcc does not allow individual weighting of the data values. If these were determined to widely differing accuracies, it may be better to use e02ddc. The computation time would be very much longer, however.

9.3 Choice of s

If the standard deviation of $f_{q, r}$ is the same for all $q$ and $r$ (the case for which this function is designed; see Section 9.2) and known to be equal, at least approximately, to $σ$, say, then following Reinsch (1967) and choosing the smoothing factor $S$ in the range $σ^{2} (m \pm \sqrt{2 m})$, where $m = m_{x} m_{y}$, is likely to give a good start in the search for a satisfactory value. If the standard deviations vary, the sum of their squares over all the data points could be used. Otherwise experimenting with different values of $S$ will be required from the start, taking account of the remarks in Section 3.
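As a worked illustration of the Reinsch range, take the assumed values $m_{x} = 11$, $m_{y} = 9$ and $σ = 0.1$ (chosen here for the arithmetic only, not taken from the example program):

```latex
m = m_{x} m_{y} = 11 \times 9 = 99, \qquad \sqrt{2 m} = \sqrt{198} \approx 14.07,
\sigma^{2} (m - \sqrt{2 m}) \approx 0.01 \times 84.93 \approx 0.849, \qquad
\sigma^{2} (m + \sqrt{2 m}) \approx 0.01 \times 113.07 \approx 1.131 .
```

Under these assumptions a first trial value of $S$ between about 0.85 and 1.13 would be reasonable.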

In that case, in view of computation time and memory requirements, it is recommended to start with a very large value for $S$ and so determine the least squares bicubic polynomial; the value returned for fp, call it ${fp}_{0}$, gives an upper bound for $S$. Then progressively decrease the value of $S$ to obtain closer fits, say by a factor of 10 in the beginning, i.e., $S = {fp}_{0} / 10$, $S = {fp}_{0} / 100$, and so on, and more carefully as the approximation shows more details.
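The initial coarse sequence of trial values can be generated as follows. This is only a sketch: smoothing_schedule is a hypothetical helper, and each value it produces would be passed as s to a separate call of e02dcc, typically with $start = Nag_Warm$ after the first call.

```c
/* Fill s_values[0..nsteps-1] with fp0/10, fp0/100, ..., where fp0 is the
   value of fp returned for a very large smoothing factor. */
void smoothing_schedule(double fp0, int nsteps, double *s_values)
{
    double s = fp0;
    for (int k = 0; k < nsteps; ++k) {
        s /= 10.0;          /* decrease by a factor of 10 per step */
        s_values[k] = s;
    }
}
```

Once the fit starts to show detail, smaller reduction factors than 10 are advisable, as the text notes.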

The number of knots of the spline returned, and their location, generally depend on the value of $S$ and on the behaviour of the function underlying the data. However, if e02dcc is called with $start = Nag_Warm$, the knots returned may also depend on the smoothing factors of the previous calls. Therefore, if, after a number of trials with different values of $S$ and $start = Nag_Warm$, a fit can finally be accepted as satisfactory, it may be worthwhile to call e02dcc once more with the selected value for $S$ but now using $start = Nag_Cold$. Often, e02dcc then returns an approximation with the same quality of fit but with fewer knots, which is, therefore, better if data reduction is also important.

9.4 Choice of nxest and nyest

The number of knots may also depend on the upper bounds nxest and nyest. Indeed, if at a certain stage in e02dcc the number of knots in one direction (say $n_{x}$) has reached the value of its upper bound (nxest), then from that moment on all subsequent knots are added in the other $(y)$ direction. Therefore, you have the option of limiting the number of knots the function locates in any direction. For example, by setting $nxest = 8$ (the lowest allowable value for nxest), you can indicate that you want an approximation which is a simple cubic polynomial in the variable $x$.

9.5 Outline of Method Used

If $S = 0$, the requisite number of knots is known in advance, i.e., $n_{x} = m_{x} + 4$ and $n_{y} = m_{y} + 4$; the interior knots are located immediately as $λ_{i} = x_{i - 2}$ and $μ_{j} = y_{j - 2}$, for $i = 5, 6, \dots, n_{x} - 4$ and $j = 5, 6, \dots, n_{y} - 4$. The corresponding least squares spline is then an interpolating spline and, therefore, a solution of the problem.
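The knot placement for the interpolating case can be written out directly. The sketch below builds one knot array from one coordinate array, combining the rule $λ_{i} = x_{i - 2}$ with the four coincident boundary knots at each end described for $spline \to lamda$ in Section 5; interpolation_knots is a hypothetical helper, not the function's internal code.

```c
/* Build the knot set used when S = 0 for m grid coordinates x[0..m-1].
   knots must have room for m + 4 values; returns the knot count n = m + 4. */
int interpolation_knots(int m, const double *x, double *knots)
{
    int n = m + 4;
    for (int k = 0; k < 4; ++k) {
        knots[k] = x[0];              /* four coincident knots at the left end */
        knots[n - 4 + k] = x[m - 1];  /* four coincident knots at the right end */
    }
    /* Interior knots, 1-based: lambda_i = x_{i-2} for i = 5, ..., n - 4.
       In 0-based storage that is knots[i - 1] = x[i - 3]. */
    for (int i = 5; i <= n - 4; ++i)
        knots[i - 1] = x[i - 3];
    return n;
}
```

For six grid coordinates 0, 1, 2, 3, 4, 5 this gives ten knots: 0, 0, 0, 0, 2, 3, 5, 5, 5, 5. The second and second-to-last data coordinates are not used as knots.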

If $S > 0$, suitable knot sets are built up in stages (starting with no interior knots in the case of a cold start but with the knot set found in a previous call if a warm start is chosen). At each stage, a bicubic spline is fitted to the data by least squares, and $θ$, the sum of squares of residuals, is computed. If $θ > S$, new knots are added to one knot set or the other so as to reduce $θ$ at the next stage. The new knots are located in intervals where the fit is particularly poor, their number depending on the value of $S$ and on the progress made so far in reducing $θ$. Sooner or later, we find that $θ \leq S$ and at that point the knot sets are accepted. The function then goes on to compute the (unique) spline which has these knot sets and which satisfies the full fitting criterion specified by (2) and (3). The theoretical solution has $θ = S$. The function computes the spline by an iterative scheme which is ended when $θ = S$ within a relative tolerance of $0.001$. The main part of each iteration consists of a linear least squares computation of special form, done in a similarly stable and efficient manner as in e02bac for least squares curve fitting.

An exception occurs when the function finds at the start that, even with no interior knots $(n_{x} = n_{y} = 8)$, the least squares spline already has its sum of squares of residuals $\leq S$. In this case, since this spline (which is simply a bicubic polynomial) also has an optimal value for the smoothness measure $η$, namely zero, it is returned at once as the (trivial) solution. It will usually mean that $S$ has been chosen too large.

For further details of the algorithm and its use see Dierckx (1982).

9.6 Evaluation of Computed Spline

The values of the computed spline at the points $(tx[r - 1], ty[r - 1])$, for $r = 1, 2, \dots, n$, may be obtained in the array ff, of length at least $n$, by the following code:

e02dec(n, tx, ty, ff, &spline, &fail);

where spline is a structure of type Nag_2dSpline which is an output argument of e02dcc.

To evaluate the computed spline on a kx by ky rectangular grid of points in the $x$-$y$ plane, which is defined by the $x$ coordinates stored in $tx[q - 1]$, for $q = 1, 2, \dots, kx$, and the $y$ coordinates stored in $ty[r - 1]$, for $r = 1, 2, \dots, ky$, returning the results in the array fg which is of length at least $kx \times ky$, the following call may be used:

e02dfc(kx, ky, tx, ty, fg, &spline, &fail);

where spline is a structure of type Nag_2dSpline which is an output argument of e02dcc. The result of the spline evaluated at grid point $(q, r)$ is returned in element $[ky \times (q - 1) + r - 1]$ of the array fg.

10 Example

This example program reads in values of mx, my, $x_{q}$, for $q = 1, 2, \dots, mx$, and $y_{r}$, for $r = 1, 2, \dots, my$, followed by values of the ordinates $f_{q, r}$ defined at the grid points $(x_{q}, y_{r})$. It then calls e02dcc to compute a bicubic spline approximation for one specified value of s, and prints the values of the computed knots and B-spline coefficients. Finally it evaluates the spline at a small sample of points on a rectangular grid.


NAG CL Interface e02dcc (dim2_spline_grid)
