1 Purpose

e02caf forms an approximation to the weighted, least squares Chebyshev series surface fit to data arbitrarily distributed on lines parallel to one independent coordinate axis.

2 Specification

Fortran Interface

Subroutine e02caf (m, n, k, l, x, y, f, w, mtot, a, na, xmin, xmax, nux, inuxp1, nuy, inuyp1, work, nwork, ifail)

Integer, Intent (In)	::	m(n), n, k, l, mtot, na, inuxp1, inuyp1, nwork
Integer, Intent (Inout)	::	ifail
Real (Kind=nag_wp), Intent (In)	::	x(mtot), y(n), f(mtot), w(mtot), xmin(n), xmax(n), nux(inuxp1), nuy(inuyp1)
Real (Kind=nag_wp), Intent (Out)	::	a(na), work(nwork)

C Header Interface

#include <nag.h>

void e02caf_ (const Integer m[], const Integer *n, const Integer *k, const Integer *l, const double x[], const double y[], const double f[], const double w[], const Integer *mtot, double a[], const Integer *na, const double xmin[], const double xmax[], const double nux[], const Integer *inuxp1, const double nuy[], const Integer *inuyp1, double work[], const Integer *nwork, Integer *ifail)

The routine may be called by the names e02caf or nagf_fit_dim2_cheb_lines.

3 Description

e02caf determines a bivariate polynomial approximation of degree $k$ in $x$ and $l$ in $y$ to the set of data points $(x_{r, s}, y_{s}, f_{r, s})$, with weights $w_{r, s}$, for $s = 1, 2, \dots, n$ and $r = 1, 2, \dots, m_{s}$. That is, the data points are on lines $y = y_{s}$, but the $x$ values may be different on each line. The values of $k$ and $l$ are prescribed by you (for guidance on their choice, see Section 9). The subroutine is based on the method described in Sections 5 and 6 of Clenshaw and Hayes (1965).

The polynomial is represented in double Chebyshev series form with arguments $\bar{x}$ and $\bar{y}$. The arguments lie in the range $-1$ to $+1$ and are related to the original variables $x$ and $y$ by the transformations

$$\bar{x} = \frac{2 x - (x_{\max} + x_{\min})}{(x_{\max} - x_{\min})} \quad \text{and} \quad \bar{y} = \frac{2 y - (y_{\max} + y_{\min})}{(y_{\max} - y_{\min})} .$$

Here $y_{\max}$ and $y_{\min}$ are set by the subroutine to, respectively, the largest and smallest value of $y_{s}$, but $x_{\max}$ and $x_{\min}$ are functions of $y$ prescribed by you (see Section 9). For this subroutine, only their values $x_{\max}^{(s)}$ and $x_{\min}^{(s)}$ at each $y = y_{s}$ are required. For each $s = 1, 2, \dots, n$, $x_{\max}^{(s)}$ must not be less than the largest $x_{r, s}$ on the line $y = y_{s}$, and, similarly, $x_{\min}^{(s)}$ must not be greater than the smallest $x_{r, s}$.
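The two transformations can be sketched as follows. This is a minimal illustration in Python, not part of the library; the function names `normalize_x` and `normalize_y` are hypothetical, and the range checks simply mirror the requirements stated above for one line $y = y_{s}$.

```python
def normalize_x(x, xmin_s, xmax_s):
    """Map x on one line y = y_s to xbar in [-1, 1] using that line's range."""
    if not xmax_s > xmin_s:
        raise ValueError("xmax(s) must exceed xmin(s)")
    if x < xmin_s or x > xmax_s:
        raise ValueError("x lies outside [xmin(s), xmax(s)]")
    return (2.0 * x - (xmax_s + xmin_s)) / (xmax_s - xmin_s)


def normalize_y(y, ymin, ymax):
    """Map y to ybar in [-1, 1]; ymin and ymax are the extreme y_s values."""
    return (2.0 * y - (ymax + ymin)) / (ymax - ymin)
```

Because `xmin_s` and `xmax_s` may differ from line to line, the same $x$ value can map to different $\bar{x}$ values on different lines.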

The double Chebyshev series can be written as

$$\sum_{i = 0}^{k} \sum_{j = 0}^{l} a_{i j} T_{i} (\bar{x}) T_{j} (\bar{y})$$

where $T_{i} (\bar{x})$ is the Chebyshev polynomial of the first kind of degree $i$ with argument $\bar{x}$, and $T_{j} (\bar{y})$ is similarly defined. However, the standard convention, followed in this subroutine, is that coefficients in the above expression which have either $i$ or $j$ zero are written as $\frac{1}{2} a_{i j}$, instead of simply $a_{i j}$, and the coefficient with both $i$ and $j$ equal to zero is written as $\frac{1}{4} a_{0, 0}$. The series with coefficients output by the subroutine should be summed using this convention. e02cbf is available to compute values of the fitted function from these coefficients.
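The summation convention can be illustrated with a short Python sketch. It is not a replacement for e02cbf; the names `cheb_values` and `eval_double_series` are hypothetical, and the coefficients are assumed to be held in a nested list `a[i][j]`. The factor is 1/2 when exactly one of `i`, `j` is zero and 1/4 when both are.

```python
def cheb_values(t, deg):
    """Return [T_0(t), ..., T_deg(t)] from the three-term recurrence."""
    vals = [1.0, t]
    for _ in range(2, deg + 1):
        vals.append(2.0 * t * vals[-1] - vals[-2])
    return vals[: deg + 1]


def eval_double_series(a, xbar, ybar):
    """Sum a[i][j] T_i(xbar) T_j(ybar), halving terms with i = 0 and again with j = 0."""
    k, l = len(a) - 1, len(a[0]) - 1
    tx, ty = cheb_values(xbar, k), cheb_values(ybar, l)
    total = 0.0
    for i in range(k + 1):
        for j in range(l + 1):
            factor = (0.5 if i == 0 else 1.0) * (0.5 if j == 0 else 1.0)
            total += factor * a[i][j] * tx[i] * ty[j]
    return total
```

For example, the coefficients `[[4, 2], [2, 1]]` represent $1 + \bar{x} + \bar{y} + \bar{x} \bar{y}$ under this convention.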

The subroutine first obtains Chebyshev series coefficients $c_{s, i}$, for $i = 0, 1, \dots, k$, of the weighted least squares polynomial curve fit of degree $k$ in $\bar{x}$ to the data on each line $y = y_{s}$, for $s = 1, 2, \dots, n$, in turn, using an auxiliary subroutine. The same subroutine is then called $k + 1$ times to fit $c_{s, i}$, for $s = 1, 2, \dots, n$, by a polynomial of degree $l$ in $\bar{y}$, for each $i = 0, 1, \dots, k$. The resulting coefficients are the required $a_{i j}$.
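The two-stage structure described above can be sketched in Python. This is an illustration under stated assumptions, not the library's implementation: the curve fits here use normal equations solved by Gaussian elimination (the auxiliary subroutine's actual method is not described in this document), the second stage is assumed to use unit weights, the forced polynomial factor is ignored, and all function names are hypothetical. Inputs are already normalized to $\bar{x}$ and $\bar{y}$.

```python
def cheb_basis(t, deg):
    """Return [T_0(t)/2, T_1(t), ..., T_deg(t)] (T_0 term halved, standard convention)."""
    vals = [1.0, t]
    for _ in range(2, deg + 1):
        vals.append(2.0 * t * vals[-1] - vals[-2])
    vals = vals[: deg + 1]
    vals[0] = 0.5
    return vals


def solve(mat, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def cheb_fit(t, f, w, deg):
    """Weighted least squares Chebyshev curve fit; residuals are scaled by w."""
    rows = [cheb_basis(ti, deg) for ti in t]
    size = deg + 1
    mat = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for row, fi, wi in zip(rows, f, w):
        for p in range(size):
            rhs[p] += wi * wi * row[p] * fi
            for q in range(size):
                mat[p][q] += wi * wi * row[p] * row[q]
    return solve(mat, rhs)


def two_stage_fit(xbar_lines, ybar, f_lines, w_lines, k, l):
    """Stage 1: fit each line in xbar. Stage 2: fit each c[s][i] over s in ybar."""
    c = [cheb_fit(xb, f, w, k) for xb, f, w in zip(xbar_lines, f_lines, w_lines)]
    ones = [1.0] * len(ybar)  # assumption: unit weights in the second stage
    return [cheb_fit(ybar, [c_s[i] for c_s in c], ones, l) for i in range(k + 1)]
```

The result is a nested list `a[i][j]` in the standard convention, so data sampled exactly from $(1 + \bar{x})(1 + \bar{y})$ should give coefficients close to `[[4, 2], [2, 1]]`.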

You can force the fit to contain a given polynomial factor. This allows for the surface fit to be constrained to have specified values and derivatives along the boundaries $x = x_{\min}$, $x = x_{\max}$, $y = y_{\min}$ and $y = y_{\max}$, or indeed along any lines $\bar{x} =$ constant or $\bar{y} =$ constant (see Section 8 of Clenshaw and Hayes (1965)).

4 References

Clenshaw C W and Hayes J G (1965) Curve and surface fitting J. Inst. Math. Appl. 1 164–183

Hayes J G (ed.) (1970) Numerical Approximation to Functions and Data Athlone Press, London

5 Arguments

1: $m (n)$ – Integer array Input

On entry: $m (s)$ must be set to $m_{s}$, the number of data $x$ values on the line $y = y_{s}$, for $s = 1, 2, \dots, n$.

Constraint: $m (s) > 0$, for $s = 1, 2, \dots, n$.

2: $n$ – Integer Input

On entry: the number of lines $y =$ constant on which data points are given.

Constraint: $n > 0$.

3: $k$ – Integer Input

On entry: $k$, the required degree of $x$ in the fit.

Constraint: for $s = 1, 2, \dots, n$, $inuxp1 - 1 \leq k < mdist (s) + inuxp1 - 1$, where $mdist (s)$ is the number of distinct $x$ values with nonzero weight on the line $y = y_{s}$. See Section 9.
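The constraint on $k$ can be checked before the call with a small helper. The following Python sketch is illustrative only; `mdist` and `k_is_valid` are hypothetical names, and the data are assumed to be held as one list of $x$ values and one list of weights per line.

```python
def mdist(x_line, w_line):
    """Number of distinct x values carrying nonzero weight on one line."""
    return len({x for x, w in zip(x_line, w_line) if w != 0.0})


def k_is_valid(k, inuxp1, x_lines, w_lines):
    """Check inuxp1 - 1 <= k < mdist(s) + inuxp1 - 1 for every line s."""
    return all(
        inuxp1 - 1 <= k < mdist(xs, ws) + inuxp1 - 1
        for xs, ws in zip(x_lines, w_lines)
    )
```

A repeated $x$ value, or a point with zero weight, does not add to `mdist`, so a line with many points can still fail the check.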

4: $l$ – Integer Input

On entry: $l$, the required degree of $y$ in the fit.

Constraints:

$l \geq 0$ ;
$inuyp1 - 1 \leq l < n + inuyp1 - 1$ .

5: $x (mtot)$ – Real (Kind=nag_wp) array Input

On entry: the $x$ values of the data points. The sequence must be

all points on $y = y_{1}$, followed by
all points on $y = y_{2}$, followed by
$⋮$
all points on $y = y_{n}$.

Constraint: for each $y_{s}$, the $x$ values must be in nondecreasing order.

6: $y (n)$ – Real (Kind=nag_wp) array Input

On entry: $y (s)$ must contain the $y$ value of line $y = y_{s}$, for $s = 1, 2, \dots, n$, on which data is given.

Constraint: the $y_{s}$ values must be in strictly increasing order.

7: $f (mtot)$ – Real (Kind=nag_wp) array Input

On entry: $f$, the data values of the dependent variable in the same sequence as the $x$ values.

8: $w (mtot)$ – Real (Kind=nag_wp) array Input

On entry: the weights to be assigned to the data points, in the same sequence as the $x$ values. These weights should be calculated from estimates of the absolute accuracies of the $f_{r}$, expressed as standard deviations, probable errors or some other measure which is of the same dimensions as $f_{r}$. Specifically, each $w_{r}$ should be inversely proportional to the accuracy estimate of $f_{r}$. Often weights all equal to unity will be satisfactory. If a particular weight is zero, the corresponding data point is omitted from the fit.

9: $mtot$ – Integer Input

On entry: the dimension of the arrays x, f and w as declared in the (sub)program from which e02caf is called.

Constraint: $mtot \geq \sum_{s = 1}^{n} m (s)$.
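The packed layout of x, f and w can be sketched as follows. This Python fragment is only an illustration of the ordering described for these arrays; `pack_lines` and `line_slice` are hypothetical helpers, and the per-line lists are an assumed input format.

```python
def pack_lines(x_lines, f_lines, w_lines):
    """Concatenate per-line data in line order; return m and the flat arrays."""
    m = [len(xs) for xs in x_lines]
    x = [v for xs in x_lines for v in xs]
    f = [v for fs in f_lines for v in fs]
    w = [v for ws in w_lines for v in ws]
    return m, x, f, w


def line_slice(m, s):
    """0-based slice of the flat arrays holding line s (s counted from 1)."""
    start = sum(m[: s - 1])
    return slice(start, start + m[s - 1])
```

The smallest permissible mtot is then `sum(m)`, the total number of data points.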

10: $a (na)$ – Real (Kind=nag_wp) array Output

On exit: contains the Chebyshev coefficients of the fit. $a (i \times (l + 1) + j + 1)$ is the coefficient $a_{i j}$ of Section 3, for $i = 0, 1, \dots, k$ and $j = 0, 1, \dots, l$, defined according to the standard convention. These coefficients are used by e02cbf to calculate values of the fitted function.
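The storage order of the coefficients can be illustrated in Python. This sketch assumes a 0-based list in which $a_{i j}$ sits at offset `i * (l + 1) + j`, so that $a_{0, 0}$ is the first element and $j$ varies fastest; the helper names `coeff` and `as_rows` are hypothetical.

```python
def coeff(a_flat, l, i, j):
    """Return a_ij from a 0-based flat list stored with j varying fastest."""
    return a_flat[i * (l + 1) + j]


def as_rows(a_flat, k, l):
    """Reshape the flat list into k + 1 rows of l + 1 coefficients each."""
    return [a_flat[i * (l + 1):(i + 1) * (l + 1)] for i in range(k + 1)]
```

Setting the coefficients out as rows in this way is also convenient when inspecting their sizes to choose the degrees (see Section 9).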

11: $na$ – Integer Input

On entry: the dimension of the array a as declared in the (sub)program from which e02caf is called.

Constraint: $na \geq (k + 1) \times (l + 1)$, the total number of coefficients in the fit.

12: $xmin (n)$ – Real (Kind=nag_wp) array Input

On entry: $xmin (s)$ must contain $x_{\min}^{(s)}$, the lower end of the range of $x$ on the line $y = y_{s}$, for $s = 1, 2, \dots, n$. It must not be greater than the lowest data value of $x$ on the line. Each $x_{\min}^{(s)}$ is scaled to $- 1.0$ in the fit. (See also Section 9.)

13: $xmax (n)$ – Real (Kind=nag_wp) array Input

On entry: $xmax (s)$ must contain $x_{\max}^{(s)}$, the upper end of the range of $x$ on the line $y = y_{s}$, for $s = 1, 2, \dots, n$. It must not be less than the highest data value of $x$ on the line. Each $x_{\max}^{(s)}$ is scaled to $+ 1.0$ in the fit. (See also Section 9.)

Constraint: $xmax (s) > xmin (s)$.

14: $nux (inuxp1)$ – Real (Kind=nag_wp) array Input

On entry: $nux (i)$ must contain the coefficient of the Chebyshev polynomial of degree $(i - 1)$ in $\bar{x}$, in the Chebyshev series representation of the polynomial factor in $\bar{x}$ which you require the fit to contain, for $i = 1, 2, \dots, inuxp1$. These coefficients are defined according to the standard convention of Section 3.

Constraint: $nux (inuxp1)$ must be nonzero, unless $inuxp1 = 1$, in which case nux is ignored.

15: $inuxp1$ – Integer Input

On entry: $INUX + 1$, where $INUX$ is the degree of a polynomial factor in $\bar{x}$ which you require the fit to contain. (See Section 3, last paragraph.)

If this option is not required, inuxp1 should be set equal to $1$.

Constraint: $1 \leq inuxp1 \leq k + 1$.

16: $nuy (inuyp1)$ – Real (Kind=nag_wp) array Input

On entry: $nuy (i)$ must contain the coefficient of the Chebyshev polynomial of degree $(i - 1)$ in $\bar{y}$, in the Chebyshev series representation of the polynomial factor which you require the fit to contain, for $i = 1, 2, \dots, inuyp1$. These coefficients are defined according to the standard convention of Section 3.

Constraint: $nuy (inuyp1)$ must be nonzero, unless $inuyp1 = 1$, in which case nuy is ignored.

17: $inuyp1$ – Integer Input

On entry: $INUY + 1$, where $INUY$ is the degree of a polynomial factor in $\bar{y}$ which you require the fit to contain. (See Section 3, last paragraph.) If this option is not required, inuyp1 should be set equal to $1$.

18: $work (nwork)$ – Real (Kind=nag_wp) array Workspace

19: $nwork$ – Integer Input

On entry: the dimension of the array work as declared in the (sub)program from which e02caf is called.

Constraint: $nwork \geq 3 \times mtot + 2 \times n \times (k + 2) + 5 \times (1 + \max (k, l))$.
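The minimum array sizes follow directly from the constraints on na and nwork. A small Python sketch (the function names `min_na` and `min_nwork` are hypothetical) evaluates them so that the arrays can be dimensioned before the call:

```python
def min_na(k, l):
    """Smallest permitted na: the total number of coefficients in the fit."""
    return (k + 1) * (l + 1)


def min_nwork(mtot, n, k, l):
    """Smallest permitted nwork according to the constraint on nwork."""
    return 3 * mtot + 2 * n * (k + 2) + 5 * (1 + max(k, l))
```

For instance, with mtot = 10, n = 3, k = 2 and l = 1 the workspace bound is 30 + 24 + 15 = 69.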

20: $ifail$ – Integer Input/Output

On entry: ifail must be set to $0$, $-1$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.

A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $-1$ means that an error message is printed while a value of $1$ means that it is not.

If halting is not appropriate, the value $-1$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-1$ or $1$ is used it is essential to test the value of ifail on exit.

On exit: $ifail = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

$ifail = 1$: On entry, $I = ⟨ value ⟩$ , $m (I) = ⟨ value ⟩$ , $k = ⟨ value ⟩$ and $inuxp1 = ⟨ value ⟩$ .
Constraint: $m (I) \geq k - inuxp1 + 2$ .

On entry, $inuxp1 = ⟨ value ⟩$ .
Constraint: $inuxp1 \geq 1$ .

On entry, $inuxp1 = ⟨ value ⟩$ and $k = ⟨ value ⟩$ .
Constraint: $inuxp1 \leq k + 1$ .

On entry, $inuyp1 = ⟨ value ⟩$ .
Constraint: $inuyp1 \geq 1$ .

On entry, $inuyp1 = ⟨ value ⟩$ and $l = ⟨ value ⟩$ .
Constraint: $inuyp1 \leq l + 1$ .

On entry, $k = ⟨ value ⟩$ .
Constraint: $k \geq 0$ .

On entry, $l = ⟨ value ⟩$ .
Constraint: $l \geq 0$ .

On entry, mtot is too small. $mtot = ⟨ value ⟩$ . Minimum possible dimension: $⟨ value ⟩$ .

On entry, $n = ⟨ value ⟩$ , $l = ⟨ value ⟩$ and $inuyp1 = ⟨ value ⟩$ .
Constraint: $n \geq l - inuyp1 + 2$ .

On entry, na is too small. $na = ⟨ value ⟩$ . Minimum possible dimension: $⟨ value ⟩$ .

On entry, nwork is too small. $nwork = ⟨ value ⟩$ . Minimum possible dimension: $⟨ value ⟩$ .

$ifail = 2$: On entry, $xmin (I)$ and $xmax (I)$ do not span the data x values on $y = y (I)$ : $I = ⟨ value ⟩$ , $xmin (I) = ⟨ value ⟩$ , $xmax (I) = ⟨ value ⟩$ and $y (I) = ⟨ value ⟩$ .

$ifail = 3$: On entry, $I = ⟨ value ⟩$ , $y (I) = ⟨ value ⟩$ and $y (I - 1) = ⟨ value ⟩$ .
Constraint: $y (I) > y (I - 1)$ .

On entry, the data x values are not nondecreasing for $y = y (I)$ : $I = ⟨ value ⟩$ and $y (I) = ⟨ value ⟩$ .

$ifail = 4$: On entry, the number of distinct x values with nonzero weight on $y = y (I)$ is less than $k - inuxp1 + 2$ : $I = ⟨ value ⟩$ , $y (I) = ⟨ value ⟩$ , $k = ⟨ value ⟩$ and $inuxp1 = ⟨ value ⟩$ .

$ifail = 5$: On entry, $inuxp1 = ⟨ value ⟩$ , $nux (inuxp1) = ⟨ value ⟩$ , $inuyp1 = ⟨ value ⟩$ and $nuy (inuyp1) = ⟨ value ⟩$ .
Constraint: if $nux (inuxp1) = 0.0$ , $inuxp1 = 1$ ; if $nuy (inuyp1) = 0.0$ , $inuyp1 = 1$ .

$ifail = - 99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 399$: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 999$: Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.

7 Accuracy

No error analysis for this method has been published. Practical experience with the method, however, is generally extremely satisfactory.

8 Parallelism and Performance

Background information to multithreading can be found in the Multithreading documentation.

e02caf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.

Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

The time taken is approximately proportional to $k \times (k \times mtot + n \times l^{2})$.

The reason for allowing $x_{\max}$ and $x_{\min}$ (which are used to normalize the range of $x$) to vary with $y$ is that unsatisfactory fits can result if the highest (or lowest) data values of the normalized $x$ on each line $y = y_{s}$ are not approximately the same. (For an explanation of this phenomenon, see page 176 of Clenshaw and Hayes (1965).) Commonly in practice, the lowest (for example) data values $x_{1, s}$, while not being approximately constant, do lie close to some smooth curve in the $(x, y)$ plane. Using values from this curve as the values of $x_{\min}$, different in general on each line, causes the lowest transformed data values ${\bar{x}}_{1, s}$ to be approximately constant. Sometimes, appropriate curves for $x_{\max}$ and $x_{\min}$ will be clear from the context of the problem (they need not be polynomials). If this is not the case, suitable curves can often be obtained by fitting to the lowest data values $x_{1, s}$ and to the corresponding highest data values of $x$, low degree polynomials in $y$, using routine e02adf, and then shifting the two curves outwards by a small amount so that they just contain all the data between them. The complete curves are not in fact supplied to the present subroutine, only their values at each $y_{s}$; and the values simply need to lie on smooth curves. More values on the complete curves will be required subsequently, when computing values of the fitted surface at arbitrary $y$ values.
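The construction of a lower boundary curve can be sketched in Python. This is a simplified stand-in for the procedure described above: it fits a straight line in $y$ by ordinary least squares rather than calling e02adf, and the name `lower_boundary` and the `margin` argument are assumptions made for illustration. The line is shifted downwards until it lies on or below every lowest data value.

```python
def lower_boundary(y, x_low, margin=0.0):
    """Values at each y_s of a straight line shifted to lie on or below x_low."""
    n = len(y)
    y_mean = sum(y) / n
    x_mean = sum(x_low) / n
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((yi - y_mean) * (xi - x_mean) for yi, xi in zip(y, x_low))
    slope = sxy / syy if syy else 0.0
    line = [x_mean + slope * (yi - y_mean) for yi in y]
    # Largest amount by which the fitted line exceeds a lowest data value.
    shift = max(li - xi for li, xi in zip(line, x_low))
    return [li - shift - margin for li in line]
```

The returned values could serve as xmin; an upper boundary for xmax can be obtained in the same way by negating the highest data values, applying the function, and negating the result.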

Naturally, a satisfactory approximation to the surface underlying the data cannot be expected if the character of the surface is not adequately represented by the data. Also, as always with polynomials, the approximating function may exhibit unwanted oscillations (particularly near the ends of the ranges) if the degrees $k$ and $l$ are taken greater than certain values, generally unknown but depending on the data. As a guide, the total number of coefficients $(k + 1) \times (l + 1)$ should be significantly smaller than (say, not more than half) the total number of data points. Similarly, $k + 1$ should be significantly smaller than most (preferably all) the $m_{s}$, and $l + 1$ significantly smaller than $n$. Closer spacing of the data near the ends of the $x$ and $y$ ranges is an advantage. In particular, if ${\bar{y}}_{s} = - \cos (π (s - 1) / (n - 1))$, for $s = 1, 2, \dots, n$, and ${\bar{x}}_{r, s} = - \cos (π (r - 1) / (m - 1))$, for $r = 1, 2, \dots, m$ (thus $m_{s} = m$ for all $s$), then the values $k = m - 1$ and $l = n - 1$ (so that the polynomial passes exactly through all the data points) should not give unwanted oscillations. Other datasets should be similarly satisfactory if they are everywhere at least as closely spaced as the above cosine values with $m$ replaced by $k + 1$ and $n$ by $l + 1$ (more precisely, if for every $s$ the largest interval between consecutive values of $\arccos {\bar{x}}_{r, s}$, for $r = 1, 2, \dots, m$, is not greater than $π / k$, and similarly for the ${\bar{y}}_{s}$). The polynomial obtained should always be examined graphically before acceptance. Note that, for this purpose, it is not sufficient to plot the polynomial only at the data values of $x$ and $y$: intermediate values should also be plotted, preferably via a graphics facility.
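The spacing criterion can be tested numerically. The Python sketch below checks only the precise condition quoted above, namely that the largest gap between consecutive values of $\arccos$ of the normalized abscissae does not exceed $π$ divided by the degree; the name `spacing_ok` and the rounding tolerance `tol` are assumptions.

```python
import math


def spacing_ok(tbar, deg, tol=1e-12):
    """True if the largest gap between consecutive arccos(tbar) values is at most pi/deg."""
    if deg == 0:
        return True
    # Clamp to [-1, 1] to guard against rounding just outside the range.
    angles = sorted(math.acos(max(-1.0, min(1.0, t))) for t in tbar)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    return bool(gaps) and max(gaps) <= math.pi / deg + tol
```

It would be applied to the ${\bar{x}}_{r, s}$ of each line with `deg = k`, and to the ${\bar{y}}_{s}$ with `deg = l`.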

Provided the data are adequate, and the surface underlying the data is of a form that can be represented by a polynomial of the chosen degrees, the subroutine should produce a good approximation to this surface. It is not, however, the true least squares surface fit nor even a polynomial in $x$ and $y$, the original variables (see Section 6 of Clenshaw and Hayes (1965)), except in certain special cases. The most important of these is where the data values of $x$ are the same on each line $y = y_{s}$ (i.e., the data points lie on a rectangular mesh in the $(x, y)$ plane), the weights of the data points are all equal, and $x_{\max}$ and $x_{\min}$ are both constants (in this case they should be set to the largest and smallest data values of $x$, respectively).

If the dataset is such that it can be satisfactorily approximated by a polynomial of degrees $k^{'}$ and $l^{'}$, say, then if higher values are used for $k$ and $l$ in the subroutine, all the coefficients $a_{i j}$ for $i > k^{'}$ or $j > l^{'}$ will take apparently random values within a range bounded by the size of the data errors, or rather less. (This behaviour of the Chebyshev coefficients, most readily observed if they are set out in a rectangular array, closely parallels that in curve-fitting, examples of which are given in Section 8 of Hayes (1970).) In practice, therefore, to establish suitable values of $k^{'}$ and $l^{'}$, you should first be seeking (within the limitations discussed above) values for $k$ and $l$ which are large enough to exhibit the behaviour described. Values for $k^{'}$ and $l^{'}$ should then be chosen as the smallest which do not exclude any coefficients significantly larger than the random ones. A polynomial of degrees $k^{'}$ and $l^{'}$ should then be fitted to the data.
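The selection of $k^{'}$ and $l^{'}$ from a trial fit can be sketched in Python. This is one possible reading of the rule above, not a library facility: the name `choose_degrees` is hypothetical, the coefficients are assumed to be set out as a nested list `a[i][j]`, and `noise` is a threshold you supply to represent the size of the apparently random coefficients.

```python
def choose_degrees(a, noise):
    """Smallest (k', l') such that every a[i][j] with i > k' or j > l' is within noise."""
    k, l = len(a) - 1, len(a[0]) - 1
    significant = [
        (i, j)
        for i in range(k + 1)
        for j in range(l + 1)
        if abs(a[i][j]) > noise
    ]
    k_dash = max((i for i, _ in significant), default=0)
    l_dash = max((j for _, j in significant), default=0)
    return k_dash, l_dash
```

Judging the threshold still requires inspecting the coefficient array, and the final fit of degrees $k^{'}$ and $l^{'}$ should be examined graphically as recommended above.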

If the option to force the fit to contain a given polynomial factor in $x$ is used and if zeros of the chosen factor coincide with data $x$ values on any line, then the effective number of data points on that line is reduced by the number of such coincidences. A similar consideration applies when forcing the $y$-direction. No account is taken of this by the subroutine when testing that the degrees $k$ and $l$ have not been chosen too large.

10 Example

This example reads data in the following order, using the notation of the argument list for e02caf above:

$$\begin{array}{ll} n \quad k \quad l & \\ y (i) \quad m (i) \quad xmin (i) \quad xmax (i), & \text{for } i = 1, 2, \dots, n \\ x (i) \quad f (i) \quad w (i), & \text{for } i = 1, 2, \dots, mtot . \end{array}$$

The data points are fitted using e02caf, and then the fitting polynomial is evaluated at the data points using e02cbf.

The output is:

the data points and their fitted values;
the Chebyshev coefficients of the fit.
