e04xaf/e04xaa computes an approximation to the gradient vector and/or the Hessian matrix for use in conjunction with, or following the use of, an optimization routine (such as e04uff/e04ufa).

e04xaa is a version of e04xaf that has additional arguments in order to make it safe for use in multithreaded applications (see Section 5).

2 Specification

2.1 Specification for e04xaf

Fortran Interface

Subroutine e04xaf (

msglvl, n, epsrf, x, mode, objfun, ldh, hforw, objf, objgrd, hcntrl, h, iwarn, work, iuser, ruser, info, ifail)

Integer, Intent (In)	::	msglvl, n, ldh
Integer, Intent (Inout)	::	mode, iuser(*), ifail
Integer, Intent (Out)	::	iwarn, info(n)
Real (Kind=nag_wp), Intent (In)	::	epsrf
Real (Kind=nag_wp), Intent (Inout)	::	x(n), hforw(n), h(ldh,*), work(*), ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	objf, objgrd(n), hcntrl(n)
External	::	objfun

C Header Interface

#include <nag.h>

void

e04xaf_ (const Integer *msglvl, const Integer *n, const double *epsrf, double x[], Integer *mode,
void (NAG_CALL *objfun)(Integer *mode, const Integer *n, const double x[], double *objf, double objgrd[], const Integer *nstate, Integer iuser[], double ruser[]),
const Integer *ldh, double hforw[], double *objf, double objgrd[], double hcntrl[], double h[], Integer *iwarn, double work[], Integer iuser[], double ruser[], Integer info[], Integer *ifail)

2.2 Specification for e04xaa

Fortran Interface

Subroutine e04xaa (

msglvl, n, epsrf, x, mode, objfun, ldh, hforw, objf, objgrd, hcntrl, h, iwarn, work, iuser, ruser, info, lwsav, iwsav, rwsav, ifail)

Integer, Intent (In)	::	msglvl, n, ldh, iwsav(1)
Integer, Intent (Inout)	::	mode, iuser(*), ifail
Integer, Intent (Out)	::	iwarn, info(n)
Real (Kind=nag_wp), Intent (In)	::	epsrf, rwsav(1)
Real (Kind=nag_wp), Intent (Inout)	::	x(n), hforw(n), h(ldh,*), work(*), ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	objf, objgrd(n), hcntrl(n)
Logical, Intent (In)	::	lwsav(1)
External	::	objfun

C Header Interface

#include <nag.h>

void

e04xaa_ (const Integer *msglvl, const Integer *n, const double *epsrf, double x[], Integer *mode,
void (NAG_CALL *objfun)(Integer *mode, const Integer *n, const double x[], double *objf, double objgrd[], const Integer *nstate, Integer iuser[], double ruser[]),
const Integer *ldh, double hforw[], double *objf, double objgrd[], double hcntrl[], double h[], Integer *iwarn, double work[], Integer iuser[], double ruser[], Integer info[], const logical lwsav[], const Integer iwsav[], const double rwsav[], Integer *ifail)

3 Description

e04xaf/e04xaa is similar to routine FDCALC described in Gill et al. (1983a). It should be noted that this routine aims to compute sufficiently accurate estimates of the derivatives for use with an optimization algorithm. If you require more accurate estimates you should refer to Chapter D04.

e04xaf/e04xaa computes finite difference approximations to the gradient vector and the Hessian matrix for a given function. The simplest approximation involves the forward-difference formula, in which the derivative $f'(x)$ of a univariate function $f(x)$ is approximated by the quantity

$$\rho_{F}(f, h) = \frac{f(x + h) - f(x)}{h}$$

for some interval $h > 0$, where the subscript 'F' denotes ‘forward-difference’ (see Gill et al. (1983b)).

To summarise the procedure used by e04xaf/e04xaa (for the case when the objective function is available and you require estimates of gradient values and Hessian matrix diagonal values, i.e., $mode = 0$) consider a univariate function $f$ at the point $x$. (In order to obtain the gradient of a multivariate function $F(x)$, where $x$ is an $n$-vector, the procedure is applied to each component of $x$, keeping the other components fixed.) Roughly speaking, the method is based on the fact that the bound on the relative truncation error in the forward-difference approximation tends to be an increasing function of $h$, while the relative condition error bound is generally a decreasing function of $h$, hence changes in $h$ will tend to have opposite effects on these errors (see Gill et al. (1983b)).
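The opposing behaviour of the two error bounds can be observed numerically. The following is a minimal Python sketch and is not part of the NAG Library; the test function (the exponential), the point $x = 1$ and the helper names `forward_difference` and `relative_error` are assumptions chosen purely for illustration.

```python
import math

def forward_difference(f, x, h):
    """Forward-difference estimate rho_F(f, h) of f'(x) for an interval h > 0."""
    return (f(x + h) - f(x)) / h

def relative_error(h, x=1.0):
    """Relative error of rho_F for f = exp, whose exact derivative is exp(x)."""
    exact = math.exp(x)
    return abs(forward_difference(math.exp, x, h) - exact) / abs(exact)

if __name__ == "__main__":
    # Large h: truncation error dominates. Very small h: rounding
    # (condition) error typically dominates.
    for h in (1e-1, 1e-4, 1e-8, 1e-12):
        print(f"h = {h:.0e}   relative error = {relative_error(h):.2e}")
```

Running the loop typically shows the error falling as $h$ decreases from $10^{-1}$ and then rising again for the smallest interval, which is why an intermediate 'best' interval is sought.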

The ‘best’ interval $h$ is given by

$$h_{F} = 2 \sqrt{\frac{(1 + |f(x)|) e_{R}}{|\Phi|}} \qquad (1)$$

where $\Phi$ is an estimate of $f''(x)$, and $e_{R}$ is an estimate of the relative error associated with computing the function (see Chapter 8 of Gill et al. (1981)). Given an interval $h$, $\Phi$ is defined by the second-order approximation

$$\Phi = \frac{f(x + h) - 2 f(x) + f(x - h)}{h^{2}} .$$

The decision as to whether a given value of $\Phi$ is acceptable involves $\hat{c}(\Phi)$, the following bound on the relative condition error in $\Phi$:

$$\hat{c}(\Phi) = \frac{4 e_{R} (1 + |f|)}{h^{2} |\Phi|}$$

(When $\Phi$ is zero, $\hat{c}(\Phi)$ is taken as an arbitrary large number.)
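The three quantities defined so far (the second-difference estimate, its relative condition error bound, and the interval of equation (1)) can be written down directly. This is an illustrative Python sketch only; the function names are hypothetical, and representing the "arbitrary large number" by infinity is an assumption of the sketch, not a statement about the library's internals.

```python
import math

def second_difference(f, x, h):
    """Phi = (f(x + h) - 2 f(x) + f(x - h)) / h^2, an estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def condition_error_bound(f, x, h, phi, eps_r):
    """c_hat(Phi) = 4 e_R (1 + |f|) / (h^2 |Phi|)."""
    if phi == 0.0:
        return math.inf  # assumed stand-in for "an arbitrary large number"
    return 4.0 * eps_r * (1.0 + abs(f(x))) / (h * h * abs(phi))

def best_forward_interval(f, x, phi, eps_r):
    """h_F from equation (1); requires a nonzero Phi."""
    return 2.0 * math.sqrt((1.0 + abs(f(x))) * eps_r / abs(phi))

if __name__ == "__main__":
    f = lambda t: t * t              # illustrative function with f''(x) = 2
    phi = second_difference(f, 1.0, 1e-3)
    print(phi, condition_error_bound(f, 1.0, 1e-3, phi, 1e-12),
          best_forward_interval(f, 1.0, phi, 1e-12))
```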

The procedure selects the interval $h_{\phi}$ (to be used in computing $\Phi$) from a sequence of trial intervals $(h_{k})$. The initial trial interval is taken as $10 \bar{h}$, where

$$\bar{h} = 2 (1 + |x|) \sqrt{e_{R}}$$

unless you specify the initial value to be used.

The value of $\hat{c}(\Phi)$ for a trial value $h_{k}$ is defined as ‘acceptable’ if it lies in the interval $[0.001, 0.1]$. In this case $h_{\phi}$ is taken as $h_{k}$, and the current value of $\Phi$ is used to compute $h_{F}$ from (1). If $\hat{c}(\Phi)$ is unacceptable, the next trial interval is chosen so that the relative condition error bound will either decrease or increase, as required. If the bound on the relative condition error is too large, a larger interval is used as the next trial value in an attempt to reduce the condition error bound. On the other hand, if the relative condition error bound is too small, $h_{k}$ is reduced.
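The selection of $h_{\phi}$ described above can be sketched as a simple loop. This is not the library's implementation: the factor of 10 by which the trial interval is enlarged or reduced and the cap on the number of trials are assumptions of the sketch (the text does not state them), and the name `select_phi_interval` is hypothetical. Only the initial interval $10 \bar{h}$ and the acceptance range $[0.001, 0.1]$ are taken from the description.

```python
import math

def select_phi_interval(f, x, eps_r, h0=None, factor=10.0, max_trials=20):
    """Return (h_phi, Phi), or None if no trial interval is acceptable.

    factor and max_trials are assumptions of this sketch.
    """
    h_bar = 2.0 * (1.0 + abs(x)) * math.sqrt(eps_r)
    # Use the caller's initial interval if it is positive, else 10 * h_bar.
    h = h0 if (h0 is not None and h0 > 0.0) else 10.0 * h_bar
    fx = f(x)
    for _ in range(max_trials):
        phi = (f(x + h) - 2.0 * fx + f(x - h)) / (h * h)
        if phi == 0.0:
            c_hat = math.inf
        else:
            c_hat = 4.0 * eps_r * (1.0 + abs(fx)) / (h * h * abs(phi))
        if 0.001 <= c_hat <= 0.1:
            return h, phi
        # Bound too large: enlarge the interval. Bound too small: reduce it.
        h = h * factor if c_hat > 0.1 else h / factor
    return None

if __name__ == "__main__":
    print(select_phi_interval(math.exp, 0.0, 1e-12))
```

Because the bound scales roughly as $1 / h^{2}$ when $\Phi$ is stable, a step of 10 in $h$ moves the bound by about a factor of 100, which matches the width of the acceptance range; that is the only motivation for the assumed factor.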

The procedure will fail to produce an acceptable value of $\hat{c}(\Phi)$ in two situations. Firstly, if $f''(x)$ is extremely small, then $\hat{c}(\Phi)$ may never become small, even for a very large value of the interval. Alternatively, $\hat{c}(\Phi)$ may never exceed $0.001$, even for a very small value of the interval. This usually implies that $f''(x)$ is extremely large, and occurs most often near a singularity.

As a check on the validity of the estimated first derivative, the procedure provides a comparison of the forward-difference approximation computed with $h_{F}$ (as above) and the central-difference approximation computed with $h_{\phi}$. Using the central-difference formula the first derivative can be approximated by

$$\rho_{c}(f, h) = \frac{f(x + h) - f(x - h)}{2 h}$$

where $h > 0$. If the values $\rho_{F}(f, h_{F})$ and $\rho_{c}(f, h_{\phi})$ do not display some agreement, neither can be considered reliable.
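A minimal Python sketch of this consistency check follows. The agreement criterion used here, a relative difference of at most $10^{-1/2}$, is only one possible reading of "agree to at least half a decimal place" and is an assumption of the sketch; the helper names are hypothetical.

```python
import math

def forward_difference(f, x, h):
    """rho_F(f, h) = (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h):
    """rho_c(f, h) = (f(x + h) - f(x - h)) / (2 h) for h > 0."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def estimates_agree(rho_f, rho_c, rel_tol=10.0 ** -0.5):
    """Assumed criterion: relative difference no larger than rel_tol."""
    scale = max(abs(rho_f), abs(rho_c))
    if scale == 0.0:
        return True  # both estimates are exactly zero
    return abs(rho_f - rho_c) <= rel_tol * scale

if __name__ == "__main__":
    # Illustrative intervals; in the routine these would be h_F and h_phi.
    rho_f = forward_difference(math.sin, 0.5, 1e-6)
    rho_c = central_difference(math.sin, 0.5, 1e-4)
    print(rho_f, rho_c, estimates_agree(rho_f, rho_c))
```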

When both function and gradients are available and you require the Hessian matrix (i.e., $mode = 1$) e04xaf/e04xaa follows a similar procedure to the case above with the exception that the gradient function $g(x)$ is substituted for the objective function and so the forward-difference interval for the first derivative of $g(x)$ with respect to variable $x_{j}$ is computed. The $j$th column of the approximate Hessian matrix is then defined as in Chapter 2 of Gill et al. (1981), by

$$\frac{g(x + h_{j} e_{j}) - g(x)}{h_{j}}$$

where $h_{j}$ is the best forward-difference interval associated with the $j$th component of $g$ and $e_{j}$ is the vector with unity in the $j$th position and zeros elsewhere.
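The column formula for the case $mode = 1$ translates directly into code. The sketch below is illustrative Python, not the library routine: `hessian_column` is a hypothetical name, the gradient is assumed to be supplied as a function returning a list, the index `j` is zero-based, and the interval `h_j` is simply passed in rather than selected by the procedure described above.

```python
def hessian_column(grad, x, j, h_j):
    """Column j (zero-based) of the approximate Hessian:
    (g(x + h_j e_j) - g(x)) / h_j."""
    g0 = grad(x)
    shifted = list(x)
    shifted[j] += h_j            # x + h_j e_j
    g1 = grad(shifted)
    return [(a - b) / h_j for a, b in zip(g1, g0)]

if __name__ == "__main__":
    # Illustrative quadratic F = x0^2 + 3 x0 x1 + 2 x1^2 with
    # gradient (2 x0 + 3 x1, 3 x0 + 4 x1) and Hessian [[2, 3], [3, 4]].
    grad = lambda x: [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]
    print(hessian_column(grad, [1.0, 2.0], 0, 1e-6))
    print(hessian_column(grad, [1.0, 2.0], 1, 1e-6))
```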

When only the objective function is available and you require the gradients and Hessian matrix (i.e., $mode = 2$) e04xaf/e04xaa again follows the same procedure as the case for $mode = 0$ except that this time the value of $\hat{c}(\Phi)$ for a trial value $h_{k}$ is defined as acceptable if it lies in the interval $[0.0001, 0.01]$ and the initial trial interval is taken as

$$\bar{h} = 2 (1 + |x|) \sqrt[4]{e_{R}} .$$

The approximate Hessian matrix $G$ is then defined as in Chapter 2 of Gill et al. (1981), by

$$G_{i j}(x) = \frac{1}{h_{i} h_{j}} \left( f(x + h_{i} e_{i} + h_{j} e_{j}) - f(x + h_{i} e_{i}) - f(x + h_{j} e_{j}) + f(x) \right) .$$
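For the case $mode = 2$, the formula for $G_{ij}$ can be evaluated from function values alone. The following Python sketch is illustrative only: `hessian_from_function` is a hypothetical name, the intervals `h` are supplied by the caller rather than chosen by the interval-selection procedure, and the sketch makes no attempt to match the routine's count of function evaluations.

```python
def hessian_from_function(f, x, h):
    """Approximate Hessian from function values:
    G_ij = (f(x + h_i e_i + h_j e_j) - f(x + h_i e_i)
            - f(x + h_j e_j) + f(x)) / (h_i h_j)."""
    n = len(x)

    def step(*indices):
        # Evaluate f at x shifted by h_k along each listed coordinate k.
        y = list(x)
        for k in indices:
            y[k] += h[k]
        return f(y)

    f0 = f(x)
    f1 = [step(i) for i in range(n)]
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            G[i][j] = (step(i, j) - f1[i] - f1[j] + f0) / (h[i] * h[j])
            G[j][i] = G[i][j]    # the formula is symmetric in i and j
    return G

if __name__ == "__main__":
    f = lambda x: x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2
    print(hessian_from_function(f, [1.0, 2.0], [1e-3, 1e-3]))
```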

4 References

Gill P E, Murray W, Saunders M A and Wright M H (1983a) Documentation for FDCALC and FDCORE Technical Report SOL 83–6 Stanford University

Gill P E, Murray W, Saunders M A and Wright M H (1983b) Computing forward-difference intervals for numerical optimization SIAM J. Sci. Statist. Comput. 4 310–321

Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press

5 Arguments

1: $msglvl$ – Integer Input

On entry: must indicate the amount of intermediate output desired (see Section 9.1 for a description of the printed output). All output is written on the current advisory message unit (see x04abf).

Value	Definition
0	No printout
1	A summary is printed out for each variable plus any warning messages.
Other	Values other than $0$ and $1$ should normally be used only at the direction of NAG.

2: $n$ – Integer Input

On entry: the number $n$ of independent variables.

Constraint: $n \geq 1$.

3: $epsrf$ – Real (Kind=nag_wp) Input

On entry: must define $e_{R}$, which is intended to be a measure of the accuracy with which the problem function $F$ can be computed. The value of $e_{R}$ should reflect the relative precision of $1 + |F(x)|$, i.e., acts as a relative precision when $|F|$ is large, and as an absolute precision when $|F|$ is small. For example, if $F(x)$ is typically of order $1000$ and the first six significant digits are known to be correct, an appropriate value for $e_{R}$ would be 1.0E−6.

A discussion of epsrf is given in Chapter 8 of Gill et al. (1981). If epsrf is either too small or too large on entry a warning will be printed if $msglvl = 1$, the argument iwarn set to the appropriate value on exit and e04xaf/e04xaa will use a default value of $e_{M}^{0.9}$, where $e_{M}$ is the machine precision.

If $epsrf \leq 0.0$ on entry, then e04xaf/e04xaa will use the default value internally. The default value will be appropriate for most simple functions that are computed with full accuracy.
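As a rough check of the example given earlier in this argument description, and taking $F(x) = 1000$ purely for illustration, the absolute error implied by $e_{R} = 10^{-6}$ is

$$e_{R} (1 + |F(x)|) = 10^{-6} \times 1001 \approx 10^{-3},$$

so a computed value such as $1000.00$ would be trusted to about its sixth significant digit, consistent with the stated precision.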

4: $x (n)$ – Real (Kind=nag_wp) array Input

On entry: the point $x$ at which the derivatives are to be computed.

5: $mode$ – Integer Input/Output

On entry: indicates which derivatives are required.

$mode = 0$: The gradient and Hessian diagonal values having supplied the objective function via objfun.
$mode = 1$: The Hessian matrix having supplied both the objective function and gradients via objfun.
$mode = 2$: The gradient values and Hessian matrix having supplied the objective function via objfun.

On exit: is changed only if you set mode negative in objfun, i.e., you have requested termination of e04xaf/e04xaa.

6: $objfun$ – Subroutine, supplied by the user. External Procedure

If $mode = 0$ or $2$, objfun must calculate the objective function; otherwise, if $mode = 1$, objfun must calculate the objective function and the gradients.

The specification of objfun is:

Fortran Interface

Subroutine objfun (

mode, n, x, objf, objgrd, nstate, iuser, ruser)

Integer, Intent (In)	::	n, nstate
Integer, Intent (Inout)	::	mode, iuser(*)
Real (Kind=nag_wp), Intent (In)	::	x(n)
Real (Kind=nag_wp), Intent (Inout)	::	ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	objf, objgrd(n)

C Header Interface

void	objfun (Integer *mode, const Integer *n, const double x[], double *objf, double objgrd[], const Integer *nstate, Integer iuser[], double ruser[])

1: $mode$ – Integer Input/Output: mode indicates which argument values within objfun need to be set.

On entry: to objfun, mode is always set to the value that you set it to before the call to e04xaf/e04xaa.

On exit: its value must not be altered unless you wish to indicate a failure within objfun, in which case it should be set to a negative value. If mode is negative on exit from objfun, the execution of e04xaf/e04xaa is terminated with ifail set to mode.
2: $n$ – Integer Input: On entry: the number $n$ of variables as input to e04xaf/e04xaa.
3: $x (n)$ – Real (Kind=nag_wp) array Input: On entry: the point $x$ at which the objective function (and gradients if $mode = 1$ ) is to be evaluated.
4: $objf$ – Real (Kind=nag_wp) Output: On exit: must be set to the value of the objective function.
5: $objgrd (n)$ – Real (Kind=nag_wp) array Output: On exit: if $mode = 1$ , $objgrd (j)$ must contain the value of the first derivative with respect to $x_{j}$ .
If $mode \neq 1$ , objgrd need not be set.
6: $nstate$ – Integer Input: On entry: will be set to $1$ on the first call of objfun by e04xaf/e04xaa, and is $0$ for all subsequent calls. Thus, if you wish, nstate may be tested within objfun in order to perform certain calculations once only. For example you may read data.
7: $iuser (*)$ – Integer array User Workspace
8: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace: objfun is called with the arguments iuser and ruser as supplied to e04xaf/e04xaa. You should use the arrays iuser and ruser to supply information to objfun.

objfun must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which e04xaf/e04xaa is called. Arguments denoted as Input must not be changed by this procedure.

Note: objfun should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by e04xaf/e04xaa. If your code inadvertently does return any NaNs or infinities, e04xaf/e04xaa is likely to produce unexpected results.

7: $ldh$ – Integer Input

On entry: the first dimension of the array h as declared in the (sub)program from which e04xaf/e04xaa is called.

Constraint: $ldh \geq n$.

8: $hforw (n)$ – Real (Kind=nag_wp) array Input/Output

On entry: the initial trial interval for computing the appropriate partial derivative to the $j$th variable.

If $hforw (j) \leq 0.0$, the initial trial interval is computed by e04xaf/e04xaa (see Section 3).

On exit: $hforw (j)$ is the best interval found for computing a forward-difference approximation to the appropriate partial derivative for the $j$th variable.

9: $objf$ – Real (Kind=nag_wp) Output

On exit: the value of the objective function evaluated at the input vector in x.

10: $objgrd (n)$ – Real (Kind=nag_wp) array Output

On exit: if $mode = 0$ or $2$, $objgrd (j)$ contains the best estimate of the first partial derivative for the $j$th variable.

If $mode = 1$, $objgrd (j)$ contains the first partial derivative for the $j$th variable evaluated at the input vector in x.

11: $hcntrl (n)$ – Real (Kind=nag_wp) array Output

On exit: $hcntrl (j)$ is the best interval found for computing a central-difference approximation to the appropriate partial derivative for the $j$th variable.

12: $h (ldh, *)$ – Real (Kind=nag_wp) array Output

Note: the second dimension of the array h must be at least $1$ if $mode = 0$ and at least $n$ if $mode = 1$ or $2$.

On exit: if $mode = 0$, the estimated Hessian diagonal elements are contained in the first column of this array.

If $mode = 1$ or $2$, the estimated Hessian matrix is contained in the leading $n \times n$ part of this array.

13: $iwarn$ – Integer Output

On exit: $iwarn = 0$ on successful exit.

If the value of epsrf on entry is too small or too large then iwarn is set to $1$ or $2$ respectively on exit and the default value for epsrf is used within e04xaf/e04xaa.

If $msglvl > 0$ then warnings will be printed if epsrf is too small or too large.

14: $work (*)$ – Real (Kind=nag_wp) array Workspace

Note: the dimension of the array work must be at least $n$ if $mode = 0$ and at least $n \times (n + 1)$ if $mode = 1$ or $2$.

15: $iuser (*)$ – Integer array User Workspace

16: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace

iuser and ruser are not used by e04xaf/e04xaa, but are passed directly to objfun and may be used to pass information to this routine.

17: $info (n)$ – Integer array Output

On exit: $info (j)$ represents diagnostic information on variable $j$ as follows:

$info (j) = 1$: The appropriate function appears to be constant. $hforw (j)$ is set to the initial trial interval value (see Section 3) corresponding to a well-scaled problem and Error est. in the printed output is set to zero. This value occurs when the estimated relative condition error in the first derivative approximation is unacceptably large for every value of the finite difference interval. If this happens when the function is not constant the initial interval may be too small; in this case, it may be worthwhile to rerun e04xaf/e04xaa with larger initial trial interval values supplied in hforw (see Section 3). This error may also occur if the function evaluation includes an inordinately large constant term or if epsrf is too large.
$info (j) = 2$: The appropriate function appears to be linear or odd. $hforw (j)$ is set to the smallest interval with acceptable bounds on the relative condition error in the forward- and backward-difference estimates. In this case, the estimated relative condition error in the second derivative approximation remained large for every trial interval, but the estimated error in the first derivative approximation was acceptable for at least one interval. If the function is not linear or odd, the relative condition error in the second derivative may be decreasing very slowly; it may be worthwhile to rerun e04xaf/e04xaa with larger initial trial interval values supplied in hforw (see Section 3).
$info (j) = 3$: The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (i.e., near a singularity). $hforw (j)$ is set to the smallest trial interval.
This value occurs when the relative condition error estimate in the second derivative remained very small for every trial interval.

If the second derivative is not large, the relative condition error in the second derivative may be increasing very slowly. It may be worthwhile to rerun e04xaf/e04xaa with smaller initial trial interval values supplied in hforw (see Section 3). This error may also occur when the given value of epsrf is not a good estimate of a bound on the absolute error in the appropriate function (i.e., epsrf is too small).
$info (j) = 4$: The algorithm terminated with an apparently acceptable estimate of the second derivative. However, the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval) and the central-difference estimates (computed with the interval used to compute the final estimate of the second derivative) do not agree to half a decimal place. The usual reason that the forward- and central-difference estimates fail to agree is that the first derivative is small.
If the first derivative is not small, it may be helpful to execute the procedure at a different point.

18: $ifail$ – Integer Input/Output

Note: for e04xaa, ifail does not occur in this position in the argument list. See the additional arguments described below.

On entry: ifail must be set to $0$, $-1$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.

A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $-1$ means that an error message is printed while a value of $1$ means that it is not.

If halting is not appropriate, the value $-1$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-1$ or $1$ is used it is essential to test the value of ifail on exit.

On exit: $ifail = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).

Note: the following are additional arguments for specific use with e04xaa. Users of e04xaf therefore need not read the remainder of this description.

18: $lwsav (1)$ – Logical array Communication Array

19: $iwsav (1)$ – Integer array Communication Array

20: $rwsav (1)$ – Real (Kind=nag_wp) array Communication Array

These arguments are no longer required by e04xaf/e04xaa.

21: $ifail$ – Integer Input/Output

Note: see the argument description for ifail above.

6 Error Indicators and Warnings

On exit from e04xaf/e04xaa both diagnostic arguments info and ifail should be tested. ifail represents an overall diagnostic indicator, whereas the integer array info represents diagnostic information on each variable.

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

$ifail = 1$: On entry, $ldh = ⟨ value ⟩$ and $n = ⟨ value ⟩$ .
Constraint: $ldh \geq n$ .

On entry, $mode = ⟨ value ⟩$ .
Constraint: $0 \leq mode \leq 2$ .

On entry, $n = ⟨ value ⟩$ .
Constraint: $n \geq 1$ .

$ifail = 2$: One or more variables have a nonzero info value.
This may not necessarily represent an unsuccessful exit – see diagnostic information on info.

$ifail < 0$: User requested termination by setting mode negative in objfun.

$ifail = - 99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 399$: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 999$: Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.

7 Accuracy

If $ifail = 0$ on exit the algorithm terminated successfully, i.e., the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval $h_{F}$) and the central-difference estimates (computed with the interval $h_{\phi}$ used to compute the final estimate of the second derivative) agree to at least half a decimal place.

In short word length implementations when computing the full Hessian matrix given function values only (i.e., $mode = 2$) the elements of the computed Hessian will have at best $1$ to $2$ figures of accuracy.

8 Parallelism and Performance

e04xaf/e04xaa is not threaded in any implementation.

9 Further Comments

To evaluate an acceptable set of finite difference intervals for a well-scaled problem, the routine will require around two function evaluations per variable; in a badly scaled problem however, as many as six function evaluations per variable may be needed.

If you request the full Hessian matrix supplying both function and gradients (i.e., $mode = 1$) or function only (i.e., $mode = 2$) then a further $n$ or $3 \times n \times (n + 1) / 2$ function evaluations respectively are required.
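The additional cost quoted above is easy to tabulate when budgeting for an expensive objective function. The helper below is a hypothetical Python sketch that simply encodes the two counts stated in this section; it does not include the evaluations spent selecting the difference intervals.

```python
def extra_hessian_evaluations(n, mode):
    """Further evaluations quoted for the full Hessian:
    n when mode = 1, and 3 n (n + 1) / 2 when mode = 2."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if mode == 1:
        return n
    if mode == 2:
        return 3 * n * (n + 1) // 2   # n (n + 1) is always even
    raise ValueError("the full Hessian is computed only for mode 1 or 2")

if __name__ == "__main__":
    for n in (2, 10, 50):
        print(n, extra_hessian_evaluations(n, 1), extra_hessian_evaluations(n, 2))
```

The quadratic growth for $mode = 2$ is the practical reason to supply gradients and use $mode = 1$ when they are available.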

9.1 Description of the Printed Output

The following is a description of the printed output from e04xaf/e04xaa as controlled by the argument msglvl.

Output when $msglvl = 1$ is as follows:

J	number of variable for which the difference interval has been computed.
$X (j)$	$j$ th variable of $x$ as set by you.
F. dif. int.	the best interval found for computing a forward-difference approximation to the appropriate partial derivative with respect to the $j$ th variable.
C. dif. int.	the best interval found for computing a central-difference approximation to the appropriate partial derivative with respect to the $j$ th variable.
Error est.	a bound on the estimated error in the final forward-difference approximation. When $info (j) = 1$ , Error est. is set to zero.
Grad. est.	best estimate of the first partial derivative with respect to the $j$ th variable.
Hess diag est.	best estimate of the second partial derivative with respect to the $j$ th variable.
fun evals.	the number of function evaluations used to compute the final difference intervals for the $j$ th variable.
$info (j)$	the value of info for the $j$ th variable.