e04xaf/e04xaa : NAG Library, Mark 26

e04xaf/e04xaa computes an approximation to the gradient vector and/or the Hessian matrix for use in conjunction with, or following the use of, an optimization routine (such as e04uff/e04ufa).

e04xaa is a version of e04xaf that has additional arguments in order to make it safe for use in multithreaded applications (see Section 5).

Fortran Interface

Subroutine e04xaf (msglvl, n, epsrf, x, mode, objfun, ldh, hforw, objf, objgrd, hcntrl, h, iwarn, work, iuser, ruser, info, ifail)

Integer, Intent (In)	::	msglvl, n, ldh
Integer, Intent (Inout)	::	mode, iuser(*), ifail
Integer, Intent (Out)	::	iwarn, info(n)
Real (Kind=nag_wp), Intent (In)	::	epsrf
Real (Kind=nag_wp), Intent (Inout)	::	x(n), hforw(n), h(ldh,*), work(*), ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	objf, objgrd(n), hcntrl(n)
External	::	objfun

C Header Interface

#include <nagmk26.h>

void e04xaf_ (const Integer *msglvl, const Integer *n, const double *epsrf, double x[], Integer *mode,
void (NAG_CALL *objfun)( Integer *mode, const Integer *n, const double x[], double *objf, double objgrd[], const Integer *nstate, Integer iuser[], double ruser[]),
const Integer *ldh, double hforw[], double *objf, double objgrd[], double hcntrl[], double h[], Integer *iwarn, double work[], Integer iuser[], double ruser[], Integer info[], Integer *ifail)

Fortran Interface

Subroutine e04xaa (msglvl, n, epsrf, x, mode, objfun, ldh, hforw, objf, objgrd, hcntrl, h, iwarn, work, iuser, ruser, info, lwsav, iwsav, rwsav, ifail)

Integer, Intent (In)	::	msglvl, n, ldh, iwsav(1)
Integer, Intent (Inout)	::	mode, iuser(*), ifail
Integer, Intent (Out)	::	iwarn, info(n)
Real (Kind=nag_wp), Intent (In)	::	epsrf, rwsav(1)
Real (Kind=nag_wp), Intent (Inout)	::	x(n), hforw(n), h(ldh,*), work(*), ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	objf, objgrd(n), hcntrl(n)
Logical, Intent (In)	::	lwsav(1)
External	::	objfun

C Header Interface

#include <nagmk26.h>

void e04xaa_ (const Integer *msglvl, const Integer *n, const double *epsrf, double x[], Integer *mode,
void (NAG_CALL *objfun)( Integer *mode, const Integer *n, const double x[], double *objf, double objgrd[], const Integer *nstate, Integer iuser[], double ruser[]),
const Integer *ldh, double hforw[], double *objf, double objgrd[], double hcntrl[], double h[], Integer *iwarn, double work[], Integer iuser[], double ruser[], Integer info[], const logical lwsav[], const Integer iwsav[], const double rwsav[], Integer *ifail)

e04xaf/e04xaa is similar to routine FDCALC described in Gill et al. (1983a). Note that this routine aims to compute estimates of the derivatives that are sufficiently accurate for use with an optimization algorithm. If you require more accurate estimates you should refer to Chapter D04.

e04xaf/e04xaa computes finite difference approximations to the gradient vector and the Hessian matrix for a given function. The simplest approximation involves the forward-difference formula, in which the derivative $f'(x)$ of a univariate function $f(x)$ is approximated by the quantity

$$\rho_{F}(f, h) = \frac{f(x + h) - f(x)}{h}$$

for some interval $h > 0$, where the subscript 'F' denotes ‘forward-difference’ (see Gill et al. (1983b)).
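As a minimal illustration of the forward-difference formula, the following Python sketch evaluates $\rho_{F}(f, h)$ for a user-supplied function. The helper name `forward_difference` and the use of `math.exp` as a test function are assumptions made for this example; they are not part of the NAG Library.

```python
import math


def forward_difference(f, x, h):
    """Forward-difference quotient rho_F(f, h) = (f(x + h) - f(x)) / h."""
    if h <= 0:
        raise ValueError("the interval h must be positive")
    return (f(x + h) - f(x)) / h


# Hypothetical test case: the derivative of exp at 0 is exactly 1.
approx = forward_difference(math.exp, 0.0, 1e-6)
print(approx)
```

The interval `1e-6` is chosen by hand here; the routine itself estimates a suitable interval for each variable.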

To summarise the procedure used by e04xaf/e04xaa (for the case when the objective function is available and you require estimates of gradient values and Hessian matrix diagonal values, i.e., $mode = 0$) consider a univariate function $f$ at the point $x$. (In order to obtain the gradient of a multivariate function $F(x)$, where $x$ is an $n$-vector, the procedure is applied to each component of $x$, keeping the other components fixed.) Roughly speaking, the method is based on the fact that the bound on the relative truncation error in the forward-difference approximation tends to be an increasing function of $h$, while the relative condition error bound is generally a decreasing function of $h$; hence changes in $h$ will tend to have opposite effects on these errors (see Gill et al. (1983b)).
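The opposing behaviour of the two error sources can be observed numerically. The sketch below (an illustration only, assuming IEEE double precision and using `math.exp` at $x = 1$ as a hypothetical test function) measures the relative error of the forward-difference quotient for a large, a moderate and a very small interval.

```python
import math


def forward_difference(f, x, h):
    """Forward-difference quotient (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def relative_error(h, x=1.0):
    """Relative error of the forward-difference estimate of exp'(x)."""
    exact = math.exp(x)  # exp is its own derivative
    return abs(forward_difference(math.exp, x, h) - exact) / exact


# Large h: truncation error dominates; tiny h: condition (rounding) error dominates.
errors = {h: relative_error(h) for h in (1e-1, 1e-8, 1e-15)}
for h, err in errors.items():
    print(f"h = {h:.0e}  relative error = {err:.2e}")
```

The moderate interval should give the smallest error of the three, which is why the routine searches for an interval that balances the two bounds.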

The ‘best’ interval $h$ is given by

$$h_{F} = 2 \sqrt{\frac{(1 + |f(x)|) e_{R}}{|\Phi|}} \qquad (1)$$

where $\Phi$ is an estimate of $f''(x)$, and $e_{R}$ is an estimate of the relative error associated with computing the function (see Chapter 8 of Gill et al. (1981)). Given an interval $h$, $\Phi$ is defined by the second-order approximation

$$\Phi = \frac{f(x + h) - 2 f(x) + f(x - h)}{h^{2}} .$$
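Equation (1) and the second-order approximation can be written down directly. The following Python sketch is illustrative only: the function names, the test function $f(x) = x^{2}$ and the value `1e-16` for $e_{R}$ are assumptions, not values used by the library.

```python
import math


def second_difference(f, x, h):
    """Second-order approximation Phi to f''(x) using interval h."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def best_forward_interval(f, x, h, e_r):
    """Interval h_F from equation (1), with Phi computed using interval h."""
    phi = second_difference(f, x, h)
    if phi == 0.0:
        raise ZeroDivisionError("Phi is zero, so equation (1) is undefined")
    return 2.0 * math.sqrt((1.0 + abs(f(x))) * e_r / abs(phi))


def square(t):
    """Hypothetical test function with second derivative exactly 2."""
    return t * t


phi = second_difference(square, 1.0, 0.1)
h_f = best_forward_interval(square, 1.0, 0.1, 1e-16)
print(phi, h_f)
```

For this test function $\Phi \approx 2$ and $f(1) = 1$, so equation (1) gives $h_{F} \approx 2 \sqrt{e_{R}}$.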

The decision as to whether a given value of $\Phi$ is acceptable involves $\hat{c}(\Phi)$, the following bound on the relative condition error in $\Phi$:

$$\hat{c}(\Phi) = \frac{4 e_{R} (1 + |f|)}{h^{2} |\Phi|}$$

(When $\Phi$ is zero, $\hat{c}(\Phi)$ is taken as an arbitrary large number.)
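The bound is a one-line computation. In the sketch below the name `condition_error_bound` is hypothetical, and returning infinity when $\Phi$ is zero is an assumption standing in for the 'arbitrary large number' mentioned above.

```python
import math


def condition_error_bound(f_x, phi, h, e_r):
    """Bound c_hat(Phi) = 4 e_R (1 + |f|) / (h^2 |Phi|)."""
    if phi == 0.0:
        # Assumption: infinity plays the role of an arbitrarily large number.
        return math.inf
    return 4.0 * e_r * (1.0 + abs(f_x)) / (h**2 * abs(phi))


# Hypothetical values: f(x) = 1, Phi = 2, h = 1e-6, e_R = 1e-16.
c_hat = condition_error_bound(1.0, 2.0, 1e-6, 1e-16)
print(c_hat)
```

Because $h^{2}$ appears in the denominator, increasing the interval by a factor of 10 reduces the bound by roughly a factor of 100 when $\Phi$ is stable.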

The procedure selects the interval $h_{\phi}$ (to be used in computing $\Phi$) from a sequence of trial intervals $(h_{k})$. The initial trial interval is taken as $10 \bar{h}$, where

$$\bar{h} = 2 (1 + |x|) \sqrt{e_{R}}$$

unless you specify the initial value to be used.

The value of $\hat{c}(\Phi)$ for a trial value $h_{k}$ is defined as ‘acceptable’ if it lies in the interval $[0.001, 0.1]$. In this case $h_{\phi}$ is taken as $h_{k}$, and the current value of $\Phi$ is used to compute $h_{F}$ from (1). If $\hat{c}(\Phi)$ is unacceptable, the next trial interval is chosen so that the relative condition error bound will either decrease or increase, as required. If the bound on the relative condition error is too large, a larger interval is used as the next trial value in an attempt to reduce the condition error bound. On the other hand, if the relative condition error bound is too small, $h_{k}$ is reduced.
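The search over trial intervals can be sketched as a simple loop. This is not the library's implementation: the factor of 10 used to enlarge or reduce the interval, the limit `max_trials`, and the function name `select_interval` are all assumptions made for illustration; only the acceptance interval $[0.001, 0.1]$ and the initial interval $10 \bar{h}$ come from the description above.

```python
import math


def select_interval(f, x, e_r, h0=None, max_trials=20):
    """Search for h_phi such that c_hat(Phi) lies in [0.001, 0.1].

    Returns (h, phi, c_hat, accepted) for the last interval tried.
    """
    h_bar = 2.0 * (1.0 + abs(x)) * math.sqrt(e_r)
    h = 10.0 * h_bar if h0 is None else h0
    f_x = f(x)
    for trial in range(max_trials):
        phi = (f(x + h) - 2.0 * f_x + f(x - h)) / h**2
        if phi == 0.0:
            c_hat = math.inf  # assumed stand-in for an arbitrarily large number
        else:
            c_hat = 4.0 * e_r * (1.0 + abs(f_x)) / (h**2 * abs(phi))
        if 0.001 <= c_hat <= 0.1:
            return h, phi, c_hat, True
        if trial < max_trials - 1:
            # Assumed update rule: enlarge h if the bound is too large,
            # reduce it if the bound is too small.
            h = h * 10.0 if c_hat > 0.1 else h / 10.0
    return h, phi, c_hat, False


# Hypothetical test: exp at x = 0 with an assumed relative error of 1e-16.
h_phi, phi, c_hat, accepted = select_interval(math.exp, 0.0, 1e-16)
print(h_phi, phi, c_hat, accepted)
```

Supplying `h0` corresponds to specifying the initial trial interval yourself; a value that is too small forces the loop to enlarge the interval before an acceptable bound is found.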

The procedure will fail to produce an acceptable value of $\hat{c}(\Phi)$ in two situations. Firstly, if $f''(x)$ is extremely small, then $\hat{c}(\Phi)$ may never become small, even for a very large value of the interval. Alternatively, $\hat{c}(\Phi)$ may never exceed $0.001$, even for a very small value of the interval. This usually implies that $f''(x)$ is extremely large, and occurs most often near a singularity.

As a check on the validity of the estimated first derivative, the procedure provides a comparison of the forward-difference approximation computed with $h_{F}$ (as above) and the central-difference approximation computed with $h_{\phi}$. Using the central-difference formula the first derivative can be approximated by

$$\rho_{c}(f, h) = \frac{f(x + h) - f(x - h)}{2 h}$$

where $h > 0$. If the approximations computed with $h_{F}$ and $h_{\phi}$ do not display some agreement, neither can be considered reliable.
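The consistency check can be sketched as follows. The tolerance `rel_tol` and the way the difference is scaled are assumptions for this example; the routine's own agreement criterion is not reproduced here.

```python
import math


def forward_difference(f, x, h):
    """Forward-difference quotient rho_F(f, h)."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h):
    """Central-difference quotient rho_c(f, h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def estimates_agree(f, x, h_f, h_phi, rel_tol=0.1):
    """Compare rho_F(f, h_F) with rho_c(f, h_phi) (assumed tolerance)."""
    fwd = forward_difference(f, x, h_f)
    cen = central_difference(f, x, h_phi)
    scale = max(abs(fwd), abs(cen), 1.0)
    return abs(fwd - cen) <= rel_tol * scale, fwd, cen


# Hypothetical intervals for exp at x = 0.
agree, fwd, cen = estimates_agree(math.exp, 0.0, 2e-8, 2e-7)
print(agree, fwd, cen)
```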

When both function and gradients are available and you require the Hessian matrix (i.e., $mode = 1$) e04xaf/e04xaa follows a similar procedure to the case above, except that the gradient function $g(x)$ is substituted for the objective function, so the forward-difference interval for the first derivative of $g(x)$ with respect to variable $x_{j}$ is computed. The $j$th column of the approximate Hessian matrix is then defined, as in Chapter 2 of Gill et al. (1981), by

$$\frac{g(x + h_{j} e_{j}) - g(x)}{h_{j}}$$

where $h_{j}$ is the best forward-difference interval associated with the $j$th component of $g$ and $e_{j}$ is the vector with unity in the $j$th position and zeros elsewhere.
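The column-by-column construction can be sketched in a few lines. In this illustration the intervals `h` are supplied by hand rather than estimated, and the quadratic test function is hypothetical; the names `hessian_from_gradient` and `grad_quadratic` are not library routines.

```python
def hessian_from_gradient(grad, x, h):
    """Approximate Hessian whose j-th column is (g(x + h_j e_j) - g(x)) / h_j."""
    n = len(x)
    g0 = grad(x)
    hess = [[0.0] * n for _ in range(n)]
    for j in range(n):
        shifted = list(x)
        shifted[j] += h[j]  # x + h_j e_j
        gj = grad(shifted)
        for i in range(n):
            hess[i][j] = (gj[i] - g0[i]) / h[j]
    return hess


def grad_quadratic(x):
    """Gradient of the hypothetical function x1^2 + 3 x1 x2 + 2 x2^2."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]


H = hessian_from_gradient(grad_quadratic, [1.0, -2.0], [1e-4, 1e-4])
print(H)
```

One additional gradient evaluation is needed per column, which matches the further $n$ evaluations quoted for this mode.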

When only the objective function is available and you require the gradients and Hessian matrix (i.e., $mode = 2$) e04xaf/e04xaa again follows the same procedure as the case for $mode = 0$, except that this time the value of $\hat{c}(\Phi)$ for a trial value $h_{k}$ is defined as acceptable if it lies in the interval $[0.0001, 0.01]$ and the initial trial interval is taken as

$$\bar{h} = 2 (1 + |x|) \sqrt[4]{e_{R}} .$$

The approximate Hessian matrix $G$ is then defined, as in Chapter 2 of Gill et al. (1981), by

$$G_{i j}(x) = \frac{1}{h_{i} h_{j}} \left( f(x + h_{i} e_{i} + h_{j} e_{j}) - f(x + h_{i} e_{i}) - f(x + h_{j} e_{j}) + f(x) \right) .$$
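The formula for $G_{i j}$ can be sketched directly from function values. The code below is an illustration under stated assumptions: the intervals are fixed by hand at `1e-4` instead of being estimated, the function names are hypothetical, and the test function is the one used in the example for this routine.

```python
def example_f(x):
    """F(x) = (x1 + 10 x2)^2 + 5 (x3 - x4)^2 + (x2 - 2 x3)^4 + 10 (x1 - x4)^4."""
    x1, x2, x3, x4 = x
    return ((x1 + 10.0 * x2) ** 2 + 5.0 * (x3 - x4) ** 2
            + (x2 - 2.0 * x3) ** 4 + 10.0 * (x1 - x4) ** 4)


def hessian_from_function(f, x, h):
    """Approximate Hessian G_ij from function values only."""
    n = len(x)
    f0 = f(x)

    def shifted(i, j=None):
        # Returns x + h_i e_i, or x + h_i e_i + h_j e_j when j is given.
        y = list(x)
        y[i] += h[i]
        if j is not None:
            y[j] += h[j]
        return y

    f_i = [f(shifted(i)) for i in range(n)]
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            f_ij = f(shifted(i, j))
            G[i][j] = G[j][i] = (f_ij - f_i[i] - f_i[j] + f0) / (h[i] * h[j])
    return G


G = hessian_from_function(example_f, [2.0, -1.0, 1.0, 1.0], [1e-4] * 4)
for row in G:
    print(["%.2f" % value for value in row])
```

This sketch reuses the single-shift values $f(x + h_{i} e_{i})$ and exploits symmetry, so it performs fewer function evaluations than a version that evaluates every term of the formula afresh.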

Gill P E, Murray W, Saunders M A and Wright M H (1983a) Documentation for FDCALC and FDCORE Technical Report SOL 83–6 Stanford University

Gill P E, Murray W, Saunders M A and Wright M H (1983b) Computing forward-difference intervals for numerical optimization SIAM J. Sci. Statist. Comput. 4 310–321

Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press

On exit from e04xaf/e04xaa both diagnostic arguments info and ifail should be tested. ifail represents an overall diagnostic indicator, whereas the integer array info represents diagnostic information on each variable.

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

Diagnostic information returned via info is as follows:

If $ifail = 0$ on exit the algorithm terminated successfully, i.e., the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval $h_{F}$) and the central-difference estimates (computed with the interval $h_{\phi}$ used to compute the final estimate of the second derivative) agree to at least half a decimal place.

In short word length implementations when computing the full Hessian matrix given function values only (i.e., $mode = 2$) the elements of the computed Hessian will have at best $1$ to $2$ figures of accuracy.

e04xaf/e04xaa is not threaded in any implementation.

To evaluate an acceptable set of finite difference intervals for a well-scaled problem, the routine will require around two function evaluations per variable; in a badly scaled problem, however, as many as six function evaluations per variable may be needed.

If you request the full Hessian matrix supplying both function and gradients (i.e., $mode = 1$) or function only (i.e., $mode = 2$) then a further $n$ or $3 \times n \times (n + 1) / 2$ function evaluations respectively are required.
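These counts can be combined into a rough cost estimate. The helper below is a hypothetical illustration that treats the figures quoted above ('around two' and 'as many as six' evaluations per variable) as lower and upper estimates; actual counts depend on the problem.

```python
def evaluation_range(n, mode):
    """Rough (low, high) estimate of the number of function evaluations."""
    low, high = 2 * n, 6 * n  # interval search: about 2 to 6 per variable
    if mode == 0:
        extra = 0
    elif mode == 1:
        extra = n  # one further evaluation per Hessian column
    elif mode == 2:
        extra = 3 * n * (n + 1) // 2  # n (n + 1) is always even
    else:
        raise ValueError("mode must be 0, 1 or 2")
    return low + extra, high + extra


for mode in (0, 1, 2):
    print(mode, evaluation_range(4, mode))
```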

The following is a description of the printed output from e04xaf/e04xaa as controlled by the argument msglvl.

Output when $msglvl = 1$ is as follows:

J	number of variable for which the difference interval has been computed.
$X (j)$	$j$th variable of $x$ as set by you.
F. dif. int.	the best interval found for computing a forward-difference approximation to the appropriate partial derivative with respect to the $j$th variable.
C. dif. int.	the best interval found for computing a central-difference approximation to the appropriate partial derivative with respect to the $j$th variable.
Error est.	a bound on the estimated error in the final forward-difference approximation. When $info (j) = 1$, Error est. is set to zero.
Grad. est.	best estimate of the first partial derivative with respect to the $j$th variable.
Hess diag est.	best estimate of the second partial derivative with respect to the $j$th variable.
fun evals.	the number of function evaluations used to compute the final difference intervals for the $j$th variable.
$info (j)$	the value of info for the $j$th variable.

This example computes the gradient vector and the Hessian matrix of the following function:

$$F(x) = (x_{1} + 10 x_{2})^{2} + 5 (x_{3} - x_{4})^{2} + (x_{2} - 2 x_{3})^{4} + 10 (x_{1} - x_{4})^{4}$$

at the point $(2, -1, 1, 1)$.
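For reference when checking finite-difference estimates, the function and its analytic gradient can be evaluated directly at the given point. The sketch below is not the example program for this routine; the function names are hypothetical and the gradient was obtained by differentiating $F$ by hand.

```python
def example_f(x):
    """F(x) = (x1 + 10 x2)^2 + 5 (x3 - x4)^2 + (x2 - 2 x3)^4 + 10 (x1 - x4)^4."""
    x1, x2, x3, x4 = x
    return ((x1 + 10.0 * x2) ** 2 + 5.0 * (x3 - x4) ** 2
            + (x2 - 2.0 * x3) ** 4 + 10.0 * (x1 - x4) ** 4)


def example_gradient(x):
    """Analytic gradient of F, derived by hand for this illustration."""
    x1, x2, x3, x4 = x
    return [
        2.0 * (x1 + 10.0 * x2) + 40.0 * (x1 - x4) ** 3,
        20.0 * (x1 + 10.0 * x2) + 4.0 * (x2 - 2.0 * x3) ** 3,
        10.0 * (x3 - x4) - 8.0 * (x2 - 2.0 * x3) ** 3,
        -10.0 * (x3 - x4) - 40.0 * (x1 - x4) ** 3,
    ]


point = [2.0, -1.0, 1.0, 1.0]
print(example_f(point))         # function value at the point
print(example_gradient(point))  # analytic gradient at the point
```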

Note: the following programs illustrate the use of e04xaf and e04xaa.

None.

Value	Definition
0	No printout
1	A summary is printed out for each variable plus any warning messages.
Other	Values other than $0$ and $1$ should normally be used only at the direction of NAG.

NAG Library Routine Document

e04xaf (estimate_deriv_old)
e04xaa (estimate_deriv)

Contents

1 Purpose
2 Specification
2.1 Specification for e04xaf
2.2 Specification for e04xaa
3 Description
4 References
5 Arguments
6 Error Indicators and Warnings
7 Accuracy
8 Parallelism and Performance
9 Further Comments
9.1 Description of the Printed Output
10 Example
10.1 Program Text
10.2 Program Data
10.3 Program Results
