NAG CL Interface
e04xac (estimate_deriv)
1
Purpose
e04xac computes an approximation to the gradient vector and/or the Hessian matrix for use in conjunction with, or following the use of, an optimization function (such as e04ucc).
2
Specification
void e04xac (Integer n, double x[],
    void (*objfun)(Integer n, const double x[], double *objf, double g[], Nag_Comm *comm),
    double *objf, double g[], double h_forward[], double h_central[],
    double hess[], Integer tdhess, Nag_DerivInfo *deriv_info,
    Nag_E04_Opt *options, Nag_Comm *comm, NagError *fail)
The function may be called by the names: e04xac or nag_opt_estimate_deriv.
3
Description
e04xac is based on the routine FDCALC described in Gill et al. (1983a). It computes finite difference approximations to the gradient vector and the Hessian matrix for a given function, and aims to provide sufficiently accurate estimates for use with an optimization algorithm.
The simplest approximation of the gradients involves the forward-difference formula, in which the derivative $f'(x)$ of a univariate function $f(x)$ is approximated by the quantity
$$\rho_F(f,h) = \frac{f(x+h) - f(x)}{h}$$
for some interval $h > 0$, where the subscript ‘F’ denotes ‘forward-difference’ (see Gill et al. (1983b)).
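As an illustration of the forward-difference formula, the following is a minimal sketch in plain C. It is not part of the NAG Library; the names `forward_diff` and `sample_square` are hypothetical and chosen only for this example.

```c
#include <assert.h>

/* Hypothetical univariate test function: f(x) = x^2, so f'(x) = 2x. */
double sample_square(double x)
{
    return x * x;
}

/* Forward-difference estimate (f(x + h) - f(x)) / h for an interval h > 0. */
double forward_diff(double (*f)(double), double x, double h)
{
    assert(h > 0.0);
    return (f(x + h) - f(x)) / h;
}
```

For `sample_square` at `x = 1.0` with `h = 1.0e-3` the estimate is close to 2.001, which shows the truncation error of order `h` that the interval selection described below tries to balance against the condition error.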
The choice of which gradients are returned by e04xac is controlled by the optional parameter options.deriv_want (see Section 11 for a description of this argument). To summarise the procedure used by e04xac when options.deriv_want = Nag_Grad_HessFull (the default value, i.e., the case when the objective function is available and you require estimates of gradient values and the full Hessian matrix), consider a univariate function $f$ at the point $x$. (In order to obtain the gradient of a multivariate function $F(x)$, where $x$ is an $n$-vector, the procedure is applied to each component of $x$, keeping the other components fixed.) Roughly speaking, the method is based on the fact that the bound on the relative truncation error in the forward-difference approximation tends to be an increasing function of $h$, while the relative condition error bound is generally a decreasing function of $h$; hence changes in $h$ will tend to have opposite effects on these errors (see Gill et al. (1983b)).
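The ‘best’ interval $h$ is given by
$$h_F = 2\sqrt{\frac{(1+|f(x)|)\,e_R}{|\Phi|}} \qquad (1)$$
where $\Phi$ is an estimate of $f''(x)$, and $e_R$ is an estimate of the relative error associated with computing the function (see Chapter 8 of Gill et al. (1981)). Given an interval $h$, $\Phi$ is defined by the second-order approximation
$$\Phi = \frac{f(x-h) - 2f(x) + f(x+h)}{h^2}.$$
The decision as to whether a given value of $\Phi$ is acceptable involves $\hat{c}(\Phi)$, the following bound on the relative condition error in $\Phi$:
$$\hat{c}(\Phi) = \frac{4e_R(1+|f|)}{h^2|\Phi|}.$$
(When $\Phi$ is zero, $\hat{c}(\Phi)$ is taken as an arbitrary large number.)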
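The two quantities used in this decision can be sketched in plain C as follows. This is an illustration only, not NAG code: the function names are hypothetical, the bound is assumed to take the form $4e_R(1+|f|)/(h^2|\Phi|)$, and the "arbitrary large number" is set to `1.0e30` purely for the sketch.

```c
#include <assert.h>

/* Hypothetical test function: f(x) = x^3, so f''(x) = 6x. */
double sample_cube(double x)
{
    return x * x * x;
}

/* Second-order difference estimate of f''(x):
   Phi = (f(x - h) - 2 f(x) + f(x + h)) / h^2. */
double second_diff(double (*f)(double), double x, double h)
{
    assert(h > 0.0);
    return (f(x - h) - 2.0 * f(x) + f(x + h)) / (h * h);
}

/* Assumed bound on the relative condition error in Phi:
   4 e_R (1 + |f(x)|) / (h^2 |Phi|), or a large number when Phi is zero. */
double cond_error_bound(double fx, double phi, double h, double e_r)
{
    double abs_f = fx < 0.0 ? -fx : fx;
    double abs_phi = phi < 0.0 ? -phi : phi;

    assert(h > 0.0 && e_r > 0.0);
    if (abs_phi == 0.0)
        return 1.0e30; /* arbitrary large value, chosen for this sketch */
    return 4.0 * e_r * (1.0 + abs_f) / (h * h * abs_phi);
}
```

Because the bound scales like $1/h^2$, enlarging the interval by a factor of 10 reduces it by roughly a factor of 100 when $\Phi$ is stable.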
The procedure selects the interval $h_\phi$ (to be used in computing $\Phi$) from a sequence of trial intervals $(h_k)$. The initial trial interval is taken as $\bar{h} = 2(1+|x|)\sqrt{e_R}$ unless you specify the initial value to be used.
The value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as ‘acceptable’ if it lies in the interval $[0.001, 0.1]$. In this case $h_\phi$ is taken as $h_k$, and the current value of $\Phi$ is used to compute $h_F$ from (1). If $\hat{c}(\Phi)$ is unacceptable, the next trial interval is chosen so that the relative condition error bound will either decrease or increase, as required. If the bound on the relative condition error is too large, a larger interval is used as the next trial value in an attempt to reduce the condition error bound. On the other hand, if the relative condition error bound is too small, $h_k$ is reduced.
The procedure will fail to produce an acceptable value of $\hat{c}(\Phi)$ in two situations. Firstly, if $f''(x)$ is extremely small, then $\hat{c}(\Phi)$ may never become small, even for a very large value of the interval. Alternatively, $\hat{c}(\Phi)$ may never exceed $0.001$, even for a very small value of the interval. This usually implies that $f''(x)$ is extremely large, and occurs most often near a singularity.
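The search over trial intervals can be sketched as a simple loop. The code below is a hedged illustration in plain C, not the FDCALC algorithm itself: the function name `find_interval`, the factor of 10 between successive trial intervals, the cap of 20 trials, and the assumed form of the condition error bound are all assumptions made for this example. A return value of 0 corresponds to the two failure situations described above.

```c
#include <assert.h>

/* Hypothetical test function: f(x) = x^4, so f''(1) = 12. */
double sample_quartic(double x)
{
    return x * x * x * x;
}

/* Search trial intervals for one whose condition error bound lies in
   [c_lo, c_hi]. Returns 1 and stores the accepted interval and the
   second-derivative estimate on success, 0 if no trial was acceptable. */
int find_interval(double (*f)(double), double x, double h0, double e_r,
                  double c_lo, double c_hi, double *h_phi, double *phi_out)
{
    double fx = f(x);
    double abs_f = fx < 0.0 ? -fx : fx;
    double h = h0;
    int k;

    assert(h0 > 0.0 && e_r > 0.0 && c_lo < c_hi);
    for (k = 0; k < 20; ++k) { /* trial cap is an assumption of this sketch */
        double phi = (f(x - h) - 2.0 * fx + f(x + h)) / (h * h);
        double abs_phi = phi < 0.0 ? -phi : phi;
        double c_hat = (abs_phi == 0.0)
                           ? 1.0e30
                           : 4.0 * e_r * (1.0 + abs_f) / (h * h * abs_phi);

        if (c_hat >= c_lo && c_hat <= c_hi) {
            *h_phi = h;
            *phi_out = phi;
            return 1;
        }
        if (c_hat > c_hi)
            h *= 10.0; /* condition error too large: enlarge the interval */
        else
            h /= 10.0; /* condition error too small: shrink the interval */
    }
    return 0;
}
```

With `sample_quartic` at `x = 1.0`, `e_r = 1.0e-14` and an initial interval of `4.0e-7`, the first trial already gives a bound of about 0.04, so the loop accepts it immediately.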
As a check on the validity of the estimated first derivative, the procedure provides a comparison of the forward-difference approximation computed with $h_F$ (as above) and the central-difference approximation computed with $h_\phi$. Using the central-difference formula the first derivative can be approximated by
$$\rho_c(f,h) = \frac{f(x+h) - f(x-h)}{2h}$$
where $h > 0$. If the values $\rho_F(f,h_F)$ and $\rho_c(f,h_\phi)$ do not display some agreement, neither can be considered reliable.
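A central-difference estimate and a simple measure of disagreement between two estimates might look like the sketch below. This is illustrative plain C, not NAG code; `central_diff`, `relative_gap` and the choice to scale the gap by the larger magnitude are assumptions of this example, and no particular agreement threshold is implied.

```c
#include <assert.h>

/* Hypothetical test function: f(x) = x^3 + x, so f'(x) = 3x^2 + 1. */
double sample_cubic_plus_x(double x)
{
    return x * x * x + x;
}

/* Central-difference estimate (f(x + h) - f(x - h)) / (2h), h > 0. */
double central_diff(double (*f)(double), double x, double h)
{
    assert(h > 0.0);
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

/* Relative gap between two derivative estimates, scaled by the larger
   magnitude; returns 0 when both estimates are exactly zero. */
double relative_gap(double a, double b)
{
    double abs_a = a < 0.0 ? -a : a;
    double abs_b = b < 0.0 ? -b : b;
    double scale = abs_a > abs_b ? abs_a : abs_b;
    double diff = a > b ? a - b : b - a;

    return scale == 0.0 ? 0.0 : diff / scale;
}
```

A large value of `relative_gap` between the forward- and central-difference estimates would suggest that neither should be trusted without further investigation.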
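The approximate Hessian matrix $G$ is defined as in Chapter 2 of Gill et al. (1981), by
$$G_{ij}(x) = \frac{1}{h_i h_j}\left(F(x + h_i e_i + h_j e_j) - F(x + h_i e_i) - F(x + h_j e_j) + F(x)\right)$$
where $h_j$ is the best forward-difference interval associated with the $j$th component of $x$ and $e_j$ is the vector with unity in the $j$th position and zeros elsewhere.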
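The function-value formula for the Hessian can be sketched in plain C as below. This is an illustration rather than the library implementation: `fd_hessian` and `sample_two_var` are hypothetical names, `int` is used in place of the NAG `Integer` type, and the intervals `h[]` are assumed to have been chosen already. Element $(i,j)$ is stored at `hess[i*tdhess + j]` using 0-based indices.

```c
#include <assert.h>

/* Hypothetical test function F(x) = x0^2 * x1 + 3 * x1^2. */
double sample_two_var(int n, const double x[])
{
    (void)n;
    return x[0] * x[0] * x[1] + 3.0 * x[1] * x[1];
}

/* G_ij = (F(x + h_i e_i + h_j e_j) - F(x + h_i e_i) - F(x + h_j e_j) + F(x))
          / (h_i h_j).
   x is perturbed temporarily and restored before returning. */
void fd_hessian(double (*F)(int, const double[]), int n, double x[],
                const double h[], double hess[], int tdhess)
{
    double f0 = F(n, x);
    int i, j;

    assert(n >= 1 && tdhess >= n);
    for (i = 0; i < n; ++i) {
        for (j = i; j < n; ++j) {
            double xi = x[i], xj = x[j];
            double fi, fj, fij;

            x[i] = xi + h[i];
            fi = F(n, x);
            x[i] = xi;

            x[j] = xj + h[j];
            fj = F(n, x);
            x[j] = xj;

            x[i] += h[i];
            x[j] += h[j]; /* for i == j this takes two steps along e_i */
            fij = F(n, x);
            x[i] = xi;
            x[j] = xj;

            hess[i * tdhess + j] = (fij - fi - fj + f0) / (h[i] * h[j]);
            hess[j * tdhess + i] = hess[i * tdhess + j]; /* symmetric fill */
        }
    }
}
```

The sketch re-evaluates `F(x + h_i e_i)` inside the double loop for clarity; caching those `n` values would reduce the number of function evaluations.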
If you require the gradients and only the diagonal of the Hessian matrix (i.e., options.deriv_want = Nag_Grad_HessDiag; see Section 11.2), e04xac follows a similar procedure to the default case, except that the initial trial interval is taken as $10\bar{h}$, where $\bar{h} = 2(1+|x|)\sqrt{e_R}$, and the value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as acceptable if it lies in the interval $[0.0001, 0.01]$. The elements of the Hessian diagonal which are returned in this case are the values of $\Phi$ corresponding to the ‘best’ intervals.
When both function and gradients are available and you require the Hessian matrix (i.e., options.deriv_want = Nag_HessFull; see Section 11.2), e04xac follows a similar procedure to the case above with the exception that the gradient function $g(x)$ is substituted for the objective function, so that the forward-difference interval for the first derivative of $g(x)$ with respect to variable $x_j$ is computed. The $j$th column of the approximate Hessian matrix is then defined as in Chapter 2 of Gill et al. (1981), by
$$\frac{g(x + h_j e_j) - g(x)}{h_j}$$
where $h_j$ is the best forward-difference interval associated with the $j$th component of $x$.
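When gradients are available, each Hessian column is a forward difference of the gradient vector. The following plain C sketch illustrates this; it is not NAG code, and the names `hessian_from_grad`, `sample_grad` and the fixed workspace size `MAX_N` are assumptions introduced for the example. Element $(i,j)$ is stored at `hess[i*tdhess + j]` using 0-based indices.

```c
#include <assert.h>

#define MAX_N 16 /* fixed workspace size, an assumption of this sketch */

/* Hypothetical gradient of F(x) = x0^2 * x1 + 3 * x1^2. */
void sample_grad(int n, const double x[], double g[])
{
    (void)n;
    g[0] = 2.0 * x[0] * x[1];
    g[1] = x[0] * x[0] + 6.0 * x[1];
}

/* Column j of the estimate is (g(x + h_j e_j) - g(x)) / h_j.
   x is perturbed temporarily and restored before returning. */
void hessian_from_grad(void (*grad)(int, const double[], double[]), int n,
                       double x[], const double h[], double hess[], int tdhess)
{
    double g0[MAX_N], g1[MAX_N];
    int i, j;

    assert(n >= 1 && n <= MAX_N && tdhess >= n);
    grad(n, x, g0);
    for (j = 0; j < n; ++j) {
        double xj = x[j];

        x[j] = xj + h[j];
        grad(n, x, g1);
        x[j] = xj;
        for (i = 0; i < n; ++i)
            hess[i * tdhess + j] = (g1[i] - g0[i]) / h[j];
    }
}
```

Note that the result of this sketch is not symmetrized: elements $(i,j)$ and $(j,i)$ come from different columns and may differ by the truncation error.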
4
References
Gill P E, Murray W, Saunders M A and Wright M H (1983a) Documentation for FDCALC and FDCORE Technical Report SOL 83–6 Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1983b) Computing forward-difference intervals for numerical optimization SIAM J. Sci. Statist. Comput. 4 310–321
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press
5
Arguments
1: n – Integer (Input)
On entry: the number of variables.
Constraint: $n \ge 1$.
2: x[n] – double (Input)
On entry: the point $x$ at which derivatives are required.
3: objfun – function, supplied by the user (External Function)
objfun must evaluate the objective function $F(x)$ and (optionally) its gradient $g(x)$ for a specified $n$-element vector $x$.
The specification of objfun is:
void objfun (Integer n, const double x[], double *objf, double g[], Nag_Comm *comm)
1: n – Integer (Input)
On entry: the number of variables.
2: x[n] – const double (Input)
On entry: the point $x$ at which the value of $F$ and, if comm->flag = 2, the first derivatives $\partial F/\partial x_j$ are required.
3: objf – double * (Output)
On exit: objfun must set objf to the value of the objective function $F$ at the current point $x$. If it is not possible to evaluate $F$ then objfun should assign a negative value to comm->flag; e04xac will then terminate.
4: g[n] – double (Output)
On exit: if comm->flag = 2 on entry, then objfun must set g[j-1] to the value of the first derivative $\partial F/\partial x_j$ at the current point $x$, for $j = 1, 2, \ldots, n$. If it is not possible to evaluate the first derivatives then objfun should assign a negative value to comm->flag; e04xac will then terminate.
If comm->flag = 0 on entry, then g is not referenced.
5: comm – Nag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to objfun.
- flag – Integer (Input/Output)
On entry: comm->flag will be set to 0 or 2. The value 0 indicates that only $F$ itself needs to be evaluated. The value 2 indicates that both $F$ and its first derivatives must be calculated.
On exit: if objfun resets comm->flag to a negative number then e04xac will terminate immediately with the error indicator NE_USER_STOP. If fail is supplied to e04xac, fail.errnum will be set to the user's setting of comm->flag.
- first – Nag_Boolean (Input)
On entry: will be set to Nag_TRUE on the first call to objfun and Nag_FALSE for all subsequent calls.
- nf – Integer (Input)
On entry: the number of evaluations of the objective function; this value will be equal to the number of calls made to objfun (including the current one).
- user – double *
- iuser – Integer *
- p – Pointer
The type Pointer will be void * with a C compiler that defines void * and char * otherwise.
Before calling e04xac these pointers may be allocated memory and initialized with various quantities for use by objfun when called from e04xac.
Note: objfun should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by e04xac. If your code inadvertently does return any NaNs or infinities, e04xac is likely to produce unexpected results.
Note: objfun should be thoroughly tested before being used in conjunction with e04xac. The array x must not be changed by objfun.
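The calling convention described for objfun can be illustrated with the sketch below. So that it compiles without the NAG headers, it declares stand-in versions of `Integer` and `Nag_Comm`; these stand-ins are assumptions of this example (the real types come from the NAG Library and `Nag_Comm` has further members), and the objective $F(x) = \sum_j x_j^2$ is a hypothetical choice.

```c
#include <assert.h>

/* Stand-in declarations for illustration only; in a real program these
   types are provided by the NAG Library headers. Only the flag member
   discussed in the text is modelled. */
typedef int Integer;
typedef struct {
    Integer flag;
} Nag_Comm;

/* Hypothetical objective F(x) = sum of x_j^2, with gradient 2 x_j. */
void objfun(Integer n, const double x[], double *objf, double g[],
            Nag_Comm *comm)
{
    Integer j;

    assert(x && objf && comm);
    if (n < 1) {
        comm->flag = -1; /* a negative flag requests termination */
        return;
    }

    *objf = 0.0;
    for (j = 0; j < n; ++j)
        *objf += x[j] * x[j];

    if (comm->flag == 2) { /* gradients requested as well */
        for (j = 0; j < n; ++j)
            g[j] = 2.0 * x[j];
    }
    /* When flag is 0 the array g is left untouched. */
}
```

The function never writes to `x`, in line with the note above, and only touches `g` when `flag` is 2.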
4: objf – double * (Output)
On exit: the value of the objective function evaluated at the input vector in x.
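5: g[n] – double (Output)
On exit: if options.deriv_want = Nag_Grad_HessFull (the default; see Section 11.2) or options.deriv_want = Nag_Grad_HessDiag, g[j-1] contains the best estimate of the first partial derivative for the $j$th variable, $j = 1, 2, \ldots, n$. If options.deriv_want = Nag_HessFull, g[j-1] contains the first partial derivative for the $j$th variable as evaluated by objfun.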
6: h_forward[n] – double (Input/Output)
On entry: if the optional parameter options.use_hfwd_init = Nag_FALSE (the default; see Section 11.2), the values contained in h_forward on entry to e04xac are ignored.
If options.use_hfwd_init = Nag_TRUE, h_forward is assumed to contain meaningful values on entry: if h_forward[j-1] > 0 then it is used as the initial trial interval for computing the appropriate partial derivative to the $j$th variable, $j = 1, 2, \ldots, n$; if h_forward[j-1] ≤ 0.0, then the initial trial interval for the $j$th variable is computed by e04xac (see Section 11.2).
On exit: h_forward[j-1] is the best interval found for computing a forward-difference approximation to the appropriate partial derivative for the $j$th variable. If you do not require this information, a NULL pointer may be provided, and e04xac will allocate memory internally to calculate the difference intervals.
Constraint: h_forward must not be NULL if options.use_hfwd_init = Nag_TRUE.
7: h_central[n] – double (Output)
On exit: h_central[j-1] is the best interval found for computing a central-difference approximation to the appropriate partial derivative for the $j$th variable. If you do not require this information, a NULL pointer may be provided, and e04xac will allocate memory internally to calculate the difference intervals.
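8: hess[n × tdhess] – double (Output)
Note: the $(i,j)$th element of the matrix is stored in hess[(i-1) × tdhess + j-1].
On exit: if the optional parameter options.deriv_want = Nag_Grad_HessFull (the default; see Section 11.2) or options.deriv_want = Nag_HessFull, the estimated Hessian matrix is contained in the leading $n$ by $n$ part of this array. If options.deriv_want = Nag_Grad_HessDiag, the $n$ elements of the estimated Hessian diagonal are contained in the first row of this array.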
9: tdhess – Integer (Input)
On entry: the stride separating matrix column elements in the array hess.
Constraint: tdhess ≥ n.
10: deriv_info[n] – Nag_DerivInfo * (Output)
On exit: deriv_info[j-1] contains diagnostic information on the $j$th variable, for $j = 1, 2, \ldots, n$.
- Nag_Deriv_OK: No unusual behaviour observed in estimating the appropriate derivative.
- Nag_Fun_Constant: The appropriate function appears to be constant.
- Nag_Fun_LinearOdd: The appropriate function appears to be linear or odd.
- Nag_2ndDeriv_Large: The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (e.g., near a singularity).
- Nag_1stDeriv_Small: The forward-difference and central-difference estimates of the appropriate first derivatives do not agree to half a decimal place; this usually occurs because the first derivative is small.
A more detailed explanation of these warnings is given in Section 9.1.
11: options – Nag_E04_Opt * (Input/Output)
On entry/exit: a pointer to a structure of type Nag_E04_Opt whose members are optional parameters for e04xac. These structure members offer the means of adjusting some of the argument values of the computation and on output will supply further details of the results. A description of the members of options is given in Section 11.
If any of these optional parameters are required then the structure options should be declared and initialized by a call to e04xxc and supplied as an argument to e04xac. However, if the optional parameters are not required the NAG defined null pointer, E04_DEFAULT, can be used in the function call.
12: comm – Nag_Comm * (Input/Output)
Note: comm is a NAG defined type (see Section 3.1.1 in the Introduction to the NAG Library CL Interface).
On entry/exit: structure containing pointers for communication with user-supplied functions; see the description of objfun for details. If you do not need to make use of this communication feature, the null pointer NAGCOMM_NULL may be used in the call to e04xac; comm will then be declared internally for use in calls to user-supplied functions.
13: fail – NagError * (Input/Output)
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6
Error Indicators and Warnings
- NE_2_INT_ARG_LT
On entry, tdhess = ⟨value⟩ while n = ⟨value⟩. These arguments must satisfy tdhess ≥ n.
- NE_ALLOC_FAIL
Dynamic memory allocation failed.
- NE_BAD_PARAM
On entry, argument options.deriv_want had an illegal value.
On entry, argument options.print_deriv had an illegal value.
- NE_H_FORWARD_NULL
options.use_hfwd_init = Nag_TRUE but argument h_forward is NULL.
- NE_INT_ARG_LT
On entry, n = ⟨value⟩.
Constraint: n ≥ 1.
- NE_INVALID_REAL_RANGE_F
Value ⟨value⟩ given to options.f_prec is not valid. Correct range is $\epsilon \le$ options.f_prec $< 1.0$.
- NE_NOT_APPEND_FILE
Cannot open file ⟨string⟩ for appending.
- NE_NOT_CLOSE_FILE
Cannot close file ⟨string⟩.
- NE_OPT_NOT_INIT
Options structure not initialized.
- NE_USER_STOP
User requested termination, user flag value = ⟨value⟩.
This exit occurs if you set comm->flag to a negative value in objfun. If fail is supplied, the value of fail.errnum will be the same as your setting of comm->flag.
- NE_WRITE_ERROR
Error occurred when writing to file ⟨string⟩.
- NW_DERIV_INFO
On exit, at least one element of the deriv_info array does not contain the value Nag_Deriv_OK. This does not necessarily represent an unsuccessful exit.
See Section 9.1 for information about the possible values which may be returned in deriv_info.
7
Accuracy
e04xac exits with fail.code = NE_NOERROR if the algorithm terminated successfully, i.e., the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval $h_F$) and the central-difference estimates (computed with the interval used to compute the final estimate of the second derivative) agree to at least half a decimal place.
8
Parallelism and Performance
e04xac is not threaded in any implementation.
9.1
Diagnostic Information
Diagnostic information is returned via the array argument deriv_info. If fail.code = NE_NOERROR on exit then deriv_info[j-1] = Nag_Deriv_OK, for $j = 1, 2, \ldots, n$. If fail.code = NW_DERIV_INFO on exit, then, for at least one $j$, deriv_info[j-1] contains one of the following values:
- Nag_Fun_Constant
The appropriate function appears to be constant. On exit, h_forward[j-1] is set to the initial trial interval corresponding to a well scaled problem, and Error est in the printed output is set to zero. This value occurs when the estimated relative condition error in the first derivative approximation is unacceptably large for every value of the finite difference interval. If this happens when the function is not constant the initial interval may be too small; in this case, it may be worthwhile to rerun e04xac with larger initial trial interval values supplied in h_forward and with the optional parameter options.use_hfwd_init set to Nag_TRUE. This error may also occur if the function evaluation includes an inordinately large constant term or if optional parameter options.f_prec is too large.
- Nag_Fun_LinearOdd
The appropriate function appears to be linear or odd. On exit, h_forward[j-1] is set to the smallest interval with acceptable bounds on the relative condition error in the forward- and backward-difference estimates. In this case, the estimated relative condition error in the second derivative approximation remained large for every trial interval, but the estimated error in the first derivative approximation was acceptable for at least one interval. If the function is not linear or odd the relative condition error in the second derivative may be decreasing very slowly. It may be worthwhile to rerun e04xac with larger initial trial interval values supplied in h_forward and with options.use_hfwd_init set to Nag_TRUE.
- Nag_2ndDeriv_Large
The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (e.g., near a singularity). On exit, h_forward[j-1] is set to the smallest trial interval.
This value occurs when the relative condition error estimate in the second derivative remained very small for every trial interval.
If the second derivative is not large the relative condition error in the second derivative may be increasing very slowly. It may be worthwhile to rerun e04xac with smaller initial trial interval values supplied in h_forward and with options.use_hfwd_init set to Nag_TRUE. This error may also occur when the given value of the optional parameter options.f_prec is not a good estimate of a bound on the absolute error in the appropriate function (i.e., options.f_prec is too small).
- Nag_1stDeriv_Small
The algorithm terminated with an apparently acceptable estimate of the second derivative. However, the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval) and the central-difference estimates (computed with the interval used to compute the final estimate of the second derivative) do not agree to half a decimal place. The usual reason that the forward- and central-difference estimates fail to agree is that the first derivative is small.
If the first derivative is not small, it may be helpful to run e04xac at a different point.
9.2
Timing
Unless the objective function can be evaluated very quickly, the run time will usually be dominated by the time spent in objfun.
To evaluate an acceptable set of finite difference intervals for a well-scaled problem, e04xac will use around two function evaluations per variable; in a badly scaled problem, six function evaluations per variable may be needed.
In the default case where gradients and the full Hessian matrix are required (i.e., optional parameter options.deriv_want = Nag_Grad_HessFull), e04xac performs further function evaluations to form the Hessian. If the full Hessian matrix is required, with you supplying both function and gradients (i.e., options.deriv_want = Nag_HessFull), further function evaluations are likewise performed.
10
Example
The example program computes the gradient vector and Hessian matrix of the following function:
$$F(x) = (x_1 + 10x_2)^2 + 5(x_3 - x_4)^2 + (x_2 - 2x_3)^4 + 10(x_1 - x_4)^4$$
at the point $(3, -1, 0, 1)^T$.
This example shows the use of some optional parameters which are discussed fully in Section 11.
The same objfun is used as in Section 10 and the derivatives are estimated at the same point. The options structure is declared and initialized by e04xxc. Two options are set to suppress all printout from e04xac: options.list is set to Nag_FALSE and options.print_deriv = Nag_D_NoPrint. The optional parameter options.deriv_want is then set and e04xac is called. The returned function value and estimated derivative values are printed out and options.deriv_want is reset to Nag_HessFull before e04xac is called again. On return, the computed function value and gradient, and estimated Hessian, are printed out.
10.1
Program Text
10.2
Program Data
None.
10.3
Program Results
11
Optional Parameters
A number of optional input and output arguments to e04xac are available through the structure argument options, type Nag_E04_Opt. An argument may be selected by assigning an appropriate value to the relevant structure member; those arguments not selected will be assigned default values. If no use is to be made of any of the optional parameters you should use the NAG defined null pointer, E04_DEFAULT, in place of options when calling e04xac; the default settings will then be used for all arguments.
Before assigning values to options directly the structure must be initialized by a call to the function e04xxc. Values may then be assigned to the structure members in the normal C manner.
After return from e04xac, the options structure may only be re-used for future calls of e04xac if the dimensions of the new problem are the same. Otherwise, the structure must be cleared by a call of e04xzc and re-initialized by a call of e04xxc before future calls. Failure to do this will result in unpredictable behaviour.
Option settings may also be read from a text file using the function e04xyc, in which case initialization of the options structure will be performed automatically if not already done. Any subsequent direct assignment to the options structure must not be preceded by initialization.
11.1
Optional Parameter Checklist and Default Values
For easy reference, the following list shows the members of options which are valid for e04xac together with their default values where relevant. The number $\epsilon$ is a generic notation for machine precision (see X02AJC).
Boolean list | Nag_TRUE
Nag_DPrintType print_deriv | Nag_D_Full
char outfile[512] | stdout
Nag_DWantType deriv_want | Nag_Grad_HessFull
Boolean use_hfwd_init | Nag_FALSE
double f_prec | $\epsilon^{0.9}$
double f_prec_used |
Integer nf |
11.2
Description of the Optional Parameters
list – Nag_Boolean | Default = Nag_TRUE
On entry: if options.list = Nag_TRUE the argument settings in the call to e04xac will be printed.
print_deriv – Nag_DPrintType | Default = Nag_D_Full
On entry: controls whether printout is produced by e04xac. The following values are available:
Nag_D_NoPrint | No output.
Nag_D_Full | Printout for each variable as described in Section 11.3.
Constraint: options.print_deriv = Nag_D_NoPrint or Nag_D_Full.
outfile – const char[512] | Default = stdout
On entry: the name of the file to which results should be printed. If options.outfile[0] = '\0' then the stdout stream is used.
deriv_want – Nag_DWantType | Default = Nag_Grad_HessFull
On entry: specifies which derivatives e04xac should estimate. The following values are available:
Nag_Grad_HessFull | Estimate the gradient and full Hessian, with you supplying the objective function via objfun.
Nag_Grad_HessDiag | Estimate the gradient and the Hessian diagonal values, with you supplying the objective function via objfun.
Nag_HessFull | Estimate the full Hessian, with you supplying the objective function and gradients via objfun.
Constraint: options.deriv_want = Nag_Grad_HessFull, Nag_Grad_HessDiag or Nag_HessFull.
use_hfwd_init – Nag_Boolean | Default = Nag_FALSE
On entry: if options.use_hfwd_init = Nag_FALSE, then e04xac ignores any values supplied on entry in h_forward, and computes the initial trial intervals itself. If options.use_hfwd_init = Nag_TRUE, then e04xac uses the forward-difference interval provided in h_forward[j-1] as the initial trial interval for computing the appropriate partial derivative to the $j$th variable, $j = 1, 2, \ldots, n$; however, if h_forward[j-1] ≤ 0.0 for some $j$, the initial trial interval for the $j$th variable is computed by e04xac.
f_prec – double | Default = $\epsilon^{0.9}$
On entry: specifies $e_R$, which is intended to measure the accuracy with which the problem function $F$ can be computed. The value of $e_R$ should reflect the relative precision of $1 + |F(x)|$, i.e., $e_R$ acts as a relative precision when $|F|$ is large, and as an absolute precision when $|F|$ is small. For example, if $F(x)$ is typically of order 1000 and the first six significant figures are known to be correct, an appropriate value of $e_R$ would be $1.0 \times 10^{-6}$. The default value of $\epsilon^{0.9}$ will be appropriate for most simple functions that are computed with full accuracy.
A discussion of $e_R$ is given in Chapter 8 of Gill et al. (1981). If you provide a value of $e_R$ which e04xac determines to be either too small or too large, the default value will be used instead and a warning will be output if optional parameter options.print_deriv = Nag_D_Full. The value actually used is returned in options.f_prec_used.
Constraint: $\epsilon \le$ options.f_prec $< 1.0$.
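f_prec_used – double
On exit: if fail.code = NE_NOERROR or NW_DERIV_INFO, or if options.nf > 1 and fail.code = NE_USER_STOP, then options.f_prec_used contains the value of $e_R$ used by e04xac. If you supply a value for options.f_prec and e04xac considers that the value supplied is neither too large nor too small, then this value will be returned in options.f_prec_used; otherwise options.f_prec_used will contain the default value, $\epsilon^{0.9}$.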
nf – Integer
On exit: the number of times the objective function has been evaluated (i.e., number of calls of objfun).
11.3
Description of Printed Output
Results from e04xac are printed out by default. The level of printed output can be controlled with the structure members options.list and options.print_deriv (see Section 11.2). If options.list = Nag_TRUE then the argument values to e04xac are listed, whereas printout of results is governed by the value of options.print_deriv.
The default, options.print_deriv = Nag_D_Full, provides the following line of output for each variable.
j | the index of the variable for which the difference interval has been computed.
X(j) | the value of $x_j$ as provided in x[j-1].
Fwd diff int | the best interval found for computing a forward-difference approximation to the appropriate partial derivative with respect to $x_j$.
Cent diff int | the best interval found for computing a central-difference approximation to the appropriate partial derivative with respect to $x_j$.
Error est | a bound on the estimated error in the final forward-difference approximation. When deriv_info[j-1] = Nag_Fun_Constant, Error est is set to zero.
Grad est | best estimate of the first partial derivative with respect to $x_j$.
Hess diag est | best estimate of the second partial derivative with respect to $x_j$.
Nfun | the number of function evaluations used to compute the final difference intervals for $x_j$.
Info | gives diagnostic information for $x_j$. Info will be one of OK, Constant?, Linear or odd?, Large 2nd deriv?, or Small 1st deriv?, corresponding to deriv_info[j-1] = Nag_Deriv_OK, Nag_Fun_Constant, Nag_Fun_LinearOdd, Nag_2ndDeriv_Large or Nag_1stDeriv_Small, respectively.