e04xac computes an approximation to the gradient vector and/or the Hessian matrix for use in conjunction with, or following the use of an optimization function (such as e04ucc).
The function may be called by the names: e04xac or nag_opt_estimate_deriv.
3 Description
e04xac is based on the routine FDCALC described in Gill et al. (1983a). It computes finite difference approximations to the gradient vector and the Hessian matrix for a given function, and aims to provide sufficiently accurate estimates for use with an optimization algorithm.
The simplest approximation of the gradients involves the forward-difference formula, in which the derivative $f'(x)$ of a univariate function $f(x)$ is approximated by the quantity
$$\rho_F(f,h) = \frac{f(x+h) - f(x)}{h}$$
for some interval $h > 0$, where the subscript ‘F’ denotes ‘forward-difference’ (see Gill et al. (1983b)).
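As a minimal illustration of the forward-difference formula (not the code used inside e04xac), the quotient can be written in a few lines of C. The names forward_diff and square are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <math.h>

/* Forward-difference estimate (f(x + h) - f(x)) / h of f'(x). */
double forward_diff(double (*f)(double), double x, double h)
{
    assert(h > 0.0); /* the formula is stated for a positive interval */
    return (f(x + h) - f(x)) / h;
}

/* Hypothetical test function f(x) = x^2, whose exact derivative is 2x. */
double square(double x)
{
    return x * x;
}
```

For this quadratic the quotient equals 2x + h, so forward_diff(square, 3.0, 0.5) returns 6.5 against an exact derivative of 6; shrinking h reduces this truncation error but amplifies the rounding error in the subtraction.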
The choice of which gradients are returned by e04xac is controlled by the optional parameter options.deriv_want (see Section 11 for a description of this argument). To summarise the procedure used by e04xac when options.deriv_want = Nag_Grad_HessFull (the default value), i.e., for the case when the objective function is available and you require estimates of gradient values and the full Hessian matrix, consider a univariate function $f$ at the point $x$. (In order to obtain the gradient of a multivariate function $F(x)$, where $x$ is an $n$-vector, the procedure is applied to each component of $x$, keeping the other components fixed.) Roughly speaking, the method is based on the fact that the bound on the relative truncation error in the forward-difference approximation tends to be an increasing function of $h$, while the relative condition error bound is generally a decreasing function of $h$; hence changes in $h$ will tend to have opposite effects on these errors (see Gill et al. (1983b)).
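The ‘best’ interval $h$ is given by
$$h_F = 2\sqrt{\frac{(1+|f(x)|)\,e_R}{|\Phi|}} \qquad (1)$$
where $\Phi$ is an estimate of $f''(x)$, and $e_R$ is an estimate of the relative error associated with computing the function (see Chapter 8 of Gill et al. (1981)). Given an interval $h$, $\Phi$ is defined by the second-order approximation
$$\Phi = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}.$$
The decision as to whether a given value of $\Phi$ is acceptable involves $\hat{c}(\Phi)$, the following bound on the relative condition error in $\Phi$:
$$\hat{c}(\Phi) = \frac{4e_R(1+|f(x)|)}{h^2|\Phi|}.$$
(When $\Phi$ is zero, $\hat{c}(\Phi)$ is taken as an arbitrary large number.)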
The procedure selects the interval $h_\phi$ (to be used in computing $\Phi$) from a sequence of trial intervals $(h_k)$. The initial trial interval is taken as
$$\bar{h} = 2(1+|x|)\sqrt{e_R}$$
unless you specify the initial value to be used.
The value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as ‘acceptable’ if it lies in the interval $[0.001, 0.1]$. In this case $h_\phi$ is taken as $h_k$, and the current value of $\Phi$ is used to compute $h_F$ from (1). If $\hat{c}(\Phi)$ is unacceptable, the next trial interval is chosen so that the relative condition error bound will either decrease or increase, as required. If the bound on the relative condition error is too large, a larger interval is used as the next trial value in an attempt to reduce the condition error bound. On the other hand, if the relative condition error bound is too small, $h_k$ is reduced.
The procedure will fail to produce an acceptable value of $\hat{c}(\Phi)$ in two situations. Firstly, if $f''(x)$ is extremely small, then $\hat{c}(\Phi)$ may never become small, even for a very large value of the interval. Alternatively, $\hat{c}(\Phi)$ may never exceed $0.001$, even for a very small value of the interval. This usually implies that $f''(x)$ is extremely large, and occurs most often near a singularity.
As a check on the validity of the estimated first derivative, the procedure provides a comparison of the forward-difference approximation computed with $h_F$ (as above) and the central-difference approximation computed with $h_\phi$. Using the central-difference formula the first derivative can be approximated by
$$\rho_c(f,h) = \frac{f(x+h) - f(x-h)}{2h}$$
where $h > 0$. If the two estimates $\rho_F(f,h_F)$ and $\rho_c(f,h_\phi)$ do not display some agreement, neither can be considered reliable.
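The approximate Hessian matrix $G$ is defined as in Chapter 2 of Gill et al. (1981), by
$$G_{ij}(x) = \frac{1}{h_i h_j}\left(f(x + h_i e_i + h_j e_j) - f(x + h_i e_i) - f(x + h_j e_j) + f(x)\right)$$
where $h_j$ is the best forward-difference interval associated with the $j$th component of $f$ and $e_j$ is the vector with unity in the $j$th position and zeros elsewhere.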
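A forward-difference Hessian built from function values only can be sketched as follows, assuming the standard formula G_ij = (f(x + h_i e_i + h_j e_j) - f(x + h_i e_i) - f(x + h_j e_j) + f(x)) / (h_i h_j). The intervals h[] are taken as given, the result is stored row by row in an n-by-n array, and the names fd_hessian and sample_fn are hypothetical; this is not e04xac's implementation.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

typedef double (*objective_fn)(int n, const double x[]);

/* Fills the n-by-n row-major array hess with forward-difference
 * second derivatives. Returns 0 on success, -1 on allocation failure. */
int fd_hessian(objective_fn f, int n, const double x[], const double h[],
               double hess[])
{
    assert(n > 0);
    double *xp = malloc((size_t)n * sizeof *xp); /* perturbed point */
    double *fi = malloc((size_t)n * sizeof *fi); /* f(x + h_i e_i)  */
    if (xp == NULL || fi == NULL) {
        free(xp);
        free(fi);
        return -1;
    }
    memcpy(xp, x, (size_t)n * sizeof *xp);
    const double f0 = f(n, x);

    /* One evaluation per coordinate direction. */
    for (int i = 0; i < n; ++i) {
        xp[i] = x[i] + h[i];
        fi[i] = f(n, xp);
        xp[i] = x[i];
    }
    /* Lower triangle, mirrored into the upper triangle by symmetry. */
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            xp[i] += h[i];
            xp[j] += h[j]; /* when i == j the total step is 2*h_i */
            const double fij = f(n, xp);
            xp[i] = x[i];
            xp[j] = x[j];
            const double gij = (fij - fi[i] - fi[j] + f0) / (h[i] * h[j]);
            hess[i * n + j] = gij;
            hess[j * n + i] = gij;
        }
    }
    free(xp);
    free(fi);
    return 0;
}

/* Hypothetical test function f(x) = x0^2 * x1 + 3 * x1^2. */
double sample_fn(int n, const double x[])
{
    (void)n;
    return x[0] * x[0] * x[1] + 3.0 * x[1] * x[1];
}
```

At the point (1, 2) the exact Hessian of sample_fn is [[4, 2], [2, 6]]; with h = 1e-4 in both directions the sketch reproduces it to about three decimal places, the off-diagonal entry carrying a truncation error equal to h.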
If you require the gradients and only the diagonal of the Hessian matrix (i.e., options.deriv_want = Nag_Grad_HessDiag; see Section 11.2), e04xac follows a similar procedure to the default case, except that the initial trial interval is taken as $10\bar{h}$, where
$$\bar{h} = 2(1+|x|)\sqrt{e_R}$$
and the value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as acceptable if it lies in the interval $[0.0001, 0.01]$. The elements of the Hessian diagonal which are returned in this case are the values of $\Phi$ corresponding to the ‘best’ intervals.
When both function and gradients are available and you require the Hessian matrix (i.e., options.deriv_want = Nag_HessFull; see Section 11.2), e04xac follows a similar procedure to the case above with the exception that the gradient function $g(x)$ is substituted for the objective function and so the forward-difference interval for the first derivative of $g_j(x)$ with respect to variable $x_j$ is computed. The $j$th column of the approximate Hessian matrix is then defined as in Chapter 2 of Gill et al. (1981), by
$$\frac{g(x + h_j e_j) - g(x)}{h_j}$$
where $h_j$ is the best forward-difference interval associated with the $j$th component of $g$.
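When gradients are available, each Hessian column needs only one extra gradient evaluation. The sketch below assumes the column formula (g(x + h_j e_j) - g(x)) / h_j with the intervals h[] supplied by the caller; hessian_from_gradient and sample_grad are hypothetical names, and no claim is made about how e04xac organises the computation internally.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>

typedef void (*gradient_fn)(int n, const double x[], double g[]);

/* Column j of the n-by-n row-major array hess is set to
 * (g(x + h_j e_j) - g(x)) / h_j.
 * Returns 0 on success, -1 on allocation failure. */
int hessian_from_gradient(gradient_fn grad, int n, const double x[],
                          const double h[], double hess[])
{
    assert(n > 0);
    double *xp = malloc((size_t)n * sizeof *xp); /* perturbed point */
    double *g0 = malloc((size_t)n * sizeof *g0); /* gradient at x   */
    double *gj = malloc((size_t)n * sizeof *gj); /* gradient at xp  */
    if (xp == NULL || g0 == NULL || gj == NULL) {
        free(xp);
        free(g0);
        free(gj);
        return -1;
    }
    memcpy(xp, x, (size_t)n * sizeof *xp);
    grad(n, x, g0);

    for (int j = 0; j < n; ++j) {
        xp[j] = x[j] + h[j];
        grad(n, xp, gj);
        xp[j] = x[j];
        for (int i = 0; i < n; ++i)
            hess[i * n + j] = (gj[i] - g0[i]) / h[j];
    }
    free(xp);
    free(g0);
    free(gj);
    return 0;
}

/* Gradient of the hypothetical function f(x) = x0^2 * x1 + 3 * x1^2. */
void sample_grad(int n, const double x[], double g[])
{
    (void)n;
    g[0] = 2.0 * x[0] * x[1];
    g[1] = x[0] * x[0] + 6.0 * x[1];
}
```

Because each column is differenced independently, the result of this sketch is not exactly symmetric: at (1, 2) with h = 1e-6 the (2,1) entry is 2.000001 while the (1,2) entry is 2.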
4 References
Gill P E, Murray W, Saunders M A and Wright M H (1983a) Documentation for FDCALC and FDCORE Technical Report SOL 83–6 Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1983b) Computing forward-difference intervals for numerical optimization SIAM J. Sci. Statist. Comput. 4 310–321
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press
5 Arguments
1: n – Integer Input
On entry: $n$, the number of variables.
Constraint: $n \ge 1$.
2: x[n] – double Input
On entry: the point $x$ at which derivatives are required.
3: objfun – function, supplied by the user External Function
objfun must evaluate the objective function $F(x)$ and (optionally) its gradient $g(x)$ for a specified $n$-element vector $x$.
1: n – Integer Input
On entry: the number of variables.
2: x[n] – const double Input
On entry: the point $x$ at which the value of $F$ and, if comm->flag = 2, the $\partial F/\partial x_j$, are required.
3: objf – double * Output
On exit: objfun must set objf to the value of the objective function $F$ at the current point $x$. If it is not possible to evaluate $F$ then objfun should assign a negative value to comm->flag; e04xac will then terminate.
4: g[n] – double Output
On exit: if comm->flag = 2 on entry, then objfun must set g[j-1] to the value of the first derivative $\partial F/\partial x_j$ at the current point $x$, for $j = 1,2,\dots,n$. If it is not possible to evaluate the first derivatives then objfun should assign a negative value to comm->flag; e04xac will then terminate.
5: comm – Nag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to objfun.
flag – Integer Input/Output
On entry: comm->flag will be set to 0 or 2. The value 0 indicates that only $F$ itself needs to be evaluated. The value 2 indicates that both $F$ and its first derivatives must be calculated.
On exit: if objfun resets comm->flag to a negative number then e04xac will terminate immediately with the error indicator NE_USER_STOP. If fail is supplied to e04xac, fail.errnum will be set to the user's setting of comm->flag.
first – Nag_Boolean Input
On entry: comm->first will be set to Nag_TRUE on the first call to objfun and Nag_FALSE for all subsequent calls.
nf – Integer Input
On entry: the number of evaluations of the objective function; this value will be equal to the number of calls made to objfun (including the current one).
user – double *
iuser – Integer *
p – Pointer
The type Pointer will be void * with a C compiler that defines void * and char * otherwise.
Before calling e04xac these pointers may be allocated memory and initialized with various quantities for use by objfun when called from e04xac.
Note: objfun should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by e04xac. If your code inadvertently does return any NaNs or infinities, e04xac is likely to produce unexpected results.
Note: objfun should be thoroughly tested before being used in conjunction with e04xac. The array x must not be changed by objfun.
4: objf – double * Output
On exit: the value of the objective function evaluated at the input vector in x.
5: g[n] – double Output
On exit: if options.deriv_want = Nag_Grad_HessFull (the default; see Section 11.2) or Nag_Grad_HessDiag, g[j-1] contains the best estimate of the first partial derivative for the $j$th variable, $j = 1,2,\dots,n$. If options.deriv_want = Nag_HessFull, g[j-1] contains the first partial derivative for the $j$th variable as evaluated by objfun.
6: h_forward[n] – double Input/Output
On entry: if the optional parameter options.use_hfwd_init = Nag_FALSE (the default; see Section 11.2), the values contained in h_forward on entry to e04xac are ignored.
If options.use_hfwd_init = Nag_TRUE, h_forward is assumed to contain meaningful values on entry: if h_forward[j-1] > 0.0 then it is used as the initial trial interval for computing the appropriate partial derivative to the $j$th variable, $j = 1,2,\dots,n$; if h_forward[j-1] ≤ 0.0, then the initial trial interval for the $j$th variable is computed by e04xac (see Section 11.2).
On exit: h_forward[j-1] is the best interval found for computing a forward-difference approximation to the appropriate partial derivative for the $j$th variable. If you do not require this information, a NULL pointer may be provided, and e04xac will allocate memory internally to calculate the difference intervals.
7: h_central[n] – double Output
On exit: h_central[j-1] is the best interval found for computing a central-difference approximation to the appropriate partial derivative for the $j$th variable. If you do not require this information, a NULL pointer may be provided, and e04xac will allocate memory internally to calculate the difference intervals.
8: hess[n × tdhess] – double Output
Note: the $(i,j)$th element of the matrix is stored in hess[(i-1) × tdhess + j-1].
On exit: if the optional parameter options.deriv_want = Nag_Grad_HessFull (the default; see Section 11.2) or Nag_HessFull, the estimated Hessian matrix is contained in the leading $n$ by $n$ part of this array. If options.deriv_want = Nag_Grad_HessDiag, the $n$ elements of the estimated Hessian diagonal are contained in the first row of this array.
9: tdhess – Integer Input
On entry: the stride separating matrix column elements in the array hess.
Constraint: tdhess ≥ n.
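The storage scheme for hess can be illustrated with a small accessor. The sketch assumes the usual row-by-row layout in which element (i, j), counted from 1, lives at offset (i-1) × tdhess + (j-1); hess_elem is a hypothetical helper, not a NAG function.

```c
#include <assert.h>

/* Returns element (i, j), 1-based, of a matrix stored row by row with
 * tdhess doubles between the starts of consecutive rows. */
double hess_elem(const double hess[], int tdhess, int i, int j)
{
    assert(i >= 1 && j >= 1 && j <= tdhess);
    return hess[(i - 1) * tdhess + (j - 1)];
}
```

With n = 2 and tdhess = 3, the array {1, 2, 0, 3, 4, 0} holds the matrix [[1, 2], [3, 4]]; the third entry of each row is padding that is never referenced.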
10: deriv_info[n] – Nag_DerivInfo * Output
On exit: deriv_info[j-1] contains diagnostic information on the $j$th variable, for $j = 1,2,\dots,n$.
deriv_info[j-1] = Nag_Deriv_OK
No unusual behaviour observed in estimating the appropriate derivative.
deriv_info[j-1] = Nag_Fun_Constant
The appropriate function appears to be constant.
deriv_info[j-1] = Nag_Fun_LinearOdd
The appropriate function appears to be linear or odd.
deriv_info[j-1] = Nag_2ndDeriv_Large
The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (e.g., near a singularity).
deriv_info[j-1] = Nag_1stDeriv_Small
The forward-difference and central-difference estimates of the appropriate first derivatives do not agree to half a decimal place; this usually occurs because the first derivative is small.
A more detailed explanation of these warnings is given in Section 9.1.
11: options – Nag_E04_Opt * Input/Output
On entry/exit: a pointer to a structure of type Nag_E04_Opt whose members are optional parameters for e04xac. These structure members offer the means of adjusting some of the argument values of the computation and on output will supply further details of the results. A description of the members of options is given in Section 11.
If any of these optional parameters are required then the structure options should be declared and initialized by a call to e04xxc and supplied as an argument to e04xac. However, if the optional parameters are not required the NAG defined null pointer, E04_DEFAULT, can be used in the function call.
12: comm – Nag_Comm * Input/Output
Note: comm is a NAG defined type (see Section 3.1.1 in the Introduction to the NAG Library CL Interface).
On entry/exit: structure containing pointers for communication with user-supplied functions; see the description of objfun for details. If you do not need to make use of this communication feature, the null pointer NAGCOMM_NULL may be used in the call to e04xac; comm will then be declared internally for use in calls to user-supplied functions.
13: fail – NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6 Error Indicators and Warnings
NE_USER_STOP
This exit occurs if you set comm->flag to a negative value in objfun. If fail is supplied, the value of fail.errnum will be the same as your setting of comm->flag.
NE_WRITE_ERROR
Error occurred when writing to file .
NW_DERIV_INFO
On exit, at least one element of the deriv_info array does not contain the value Nag_Deriv_OK. This does not necessarily represent an unsuccessful exit.
See Section 9.1 for information about the possible values which may be returned in deriv_info.
7 Accuracy
e04xac exits with fail.code = NE_NOERROR if the algorithm terminated successfully, i.e., the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval $h_F$) and the central-difference estimates (computed with the interval $h_\phi$ used to compute the final estimate of the second derivative) agree to at least half a decimal place.
8 Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
e04xac is not threaded in any implementation.
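9 Further Comments
9.1 Diagnostic Information
Diagnostic information is returned via the array argument deriv_info. If fail.code = NE_NOERROR on exit then deriv_info[j-1] = Nag_Deriv_OK, for $j = 1,2,\dots,n$. If fail.code = NW_DERIV_INFO on exit, then, for at least one $j$, deriv_info[j-1] contains one of the following values: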
Nag_Fun_Constant
The appropriate function appears to be constant. On exit, h_forward[j-1] is set to the initial trial interval corresponding to a well scaled problem, and Error est in the printed output is set to zero. This value occurs when the estimated relative condition error in the first derivative approximation is unacceptably large for every value of the finite difference interval. If this happens when the function is not constant the initial interval may be too small; in this case, it may be worthwhile to rerun e04xac with larger initial trial interval values supplied in h_forward and with the optional parameter options.use_hfwd_init set to Nag_TRUE. This error may also occur if the function evaluation includes an inordinately large constant term or if optional parameter options.f_prec is too large.
Nag_Fun_LinearOdd
The appropriate function appears to be linear or odd. On exit, h_forward[j-1] is set to the smallest interval with acceptable bounds on the relative condition error in the forward- and backward-difference estimates. In this case, the estimated relative condition error in the second derivative approximation remained large for every trial interval, but the estimated error in the first derivative approximation was acceptable for at least one interval. If the function is not linear or odd the relative condition error in the second derivative may be decreasing very slowly. It may be worthwhile to rerun e04xac with larger initial trial interval values supplied in h_forward and with options.use_hfwd_init set to Nag_TRUE.
Nag_2ndDeriv_Large
The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (e.g., near a singularity). On exit, h_forward[j-1] is set to the smallest trial interval.
This value occurs when the relative condition error estimate in the second derivative remained very small for every trial interval.
If the second derivative is not large the relative condition error in the second derivative may be increasing very slowly. It may be worthwhile to rerun e04xac with smaller initial trial interval values supplied in h_forward and with options.use_hfwd_init set to Nag_TRUE. This error may also occur when the given value of the optional parameter options.f_prec is not a good estimate of a bound on the absolute error in the appropriate function (i.e., options.f_prec is too small).
Nag_1stDeriv_Small
The algorithm terminated with an apparently acceptable estimate of the second derivative. However, the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval) and the central-difference estimates (computed with the interval used to compute the final estimate of the second derivative) do not agree to half a decimal place. The usual reason that the forward- and central-difference estimates fail to agree is that the first derivative is small.
If the first derivative is not small, it may be helpful to run e04xac at a different point.
9.2 Timing
Unless the objective function can be evaluated very quickly, the run time will usually be dominated by the time spent in objfun.
To evaluate an acceptable set of finite difference intervals for a well-scaled problem e04xac will use around two function evaluations per variable; in a badly scaled problem, six function evaluations per variable may be needed.
In the default case where gradients and the full Hessian matrix are required (i.e., optional parameter options.deriv_want = Nag_Grad_HessFull), e04xac performs a further $3n(n+1)/2$ function evaluations. If the full Hessian matrix is required, with you supplying both function and gradients (i.e., options.deriv_want = Nag_HessFull), a further $n$ function evaluations are performed.
10 Example
The example program computes the gradient vector and Hessian matrix of the following function:
$$F(x) = (x_1 + 10x_2)^2 + 5(x_3 - x_4)^2 + (x_2 - 2x_3)^4 + 10(x_1 - x_4)^4$$
at the point $(3, -1, 0, 1)^T$.
This example shows the use of some optional parameters which are discussed fully in Section 11.
The same objfun is used as in Section 10 and the derivatives are estimated at the same point. The options structure is declared and initialized by e04xxc. Two options are set to suppress all printout from e04xac: options.list is set to Nag_FALSE and options.print_deriv = Nag_D_NoPrint. options.deriv_want is set to Nag_Grad_HessDiag and e04xac is called. The returned function value and estimated derivative values are printed out and options.deriv_want is reset to Nag_HessFull before e04xac is called again. On return, the computed function value and gradient, and estimated Hessian, are printed out.
11 Optional Parameters
A number of optional input and output arguments to e04xac are available through the structure argument options, type Nag_E04_Opt. An argument may be selected by assigning an appropriate value to the relevant structure member; those arguments not selected will be assigned default values. If no use is to be made of any of the optional parameters you should use the NAG defined null pointer, E04_DEFAULT, in place of options when calling e04xac; the default settings will then be used for all arguments.
Before assigning values to options directly the structure must be initialized by a call to the function e04xxc. Values may then be assigned to the structure members in the normal C manner.
After return from e04xac, the options structure may only be re-used for future calls of e04xac if the dimensions of the new problem are the same. Otherwise, the structure must be cleared by a call of e04xzc and re-initialized by a call of e04xxc before future calls. Failure to do this will result in unpredictable behaviour.
Option settings may also be read from a text file using the function e04xyc in which case initialization of the options structure will be performed automatically if not already done. Any subsequent direct assignment to the options structure must not be preceded by initialization.
11.1 Optional Parameter Checklist and Default Values
For easy reference, the following list shows the members of options which are valid for e04xac together with their default values where relevant. The number $\epsilon$ is a generic notation for machine precision (see X02AJC).
Boolean list                   Nag_TRUE
Nag_DPrintType print_deriv     Nag_D_Full
char outfile[512]              stdout
Nag_DWantType deriv_want       Nag_Grad_HessFull
Boolean use_hfwd_init          Nag_FALSE
double f_prec                  $\epsilon^{0.9}$
double f_prec_used
Integer nf
11.2 Description of the Optional Parameters
list – Nag_Boolean
Default = Nag_TRUE
On entry: if options.list = Nag_TRUE the argument settings in the call to e04xac will be printed.
print_deriv – Nag_DPrintType
Default = Nag_D_Full
On entry: controls whether printout is produced by e04xac. The following values are available:
Nag_D_NoPrint
No output.
Nag_D_Full
Printout for each variable as described in Section 11.3.
Constraint: options.print_deriv = Nag_D_NoPrint or Nag_D_Full.
outfile – const char[512]
Default = stdout
On entry: the name of the file to which results should be printed. If options.outfile[0] = '\0' then the stdout stream is used.
deriv_want – Nag_DWantType
Default = Nag_Grad_HessFull
On entry: specifies which derivatives e04xac should estimate. The following values are available:
Nag_Grad_HessFull
Estimate the gradient and full Hessian, with you supplying the objective function via objfun.
Nag_Grad_HessDiag
Estimate the gradient and the Hessian diagonal values, with you supplying the objective function via objfun.
Nag_HessFull
Estimate the full Hessian, with you supplying the objective function and gradients via objfun.
Constraint: options.deriv_want = Nag_Grad_HessFull, Nag_Grad_HessDiag or Nag_HessFull.
use_hfwd_init – Nag_Boolean
Default = Nag_FALSE
On entry: if options.use_hfwd_init = Nag_FALSE, then e04xac ignores any values supplied on entry in h_forward, and computes the initial trial intervals itself. If options.use_hfwd_init = Nag_TRUE, then e04xac uses the forward-difference interval provided in h_forward[j-1] as the initial trial interval for computing the appropriate partial derivative to the $j$th variable, $j = 1,2,\dots,n$; however, if h_forward[j-1] ≤ 0.0 for some $j$, the initial trial interval for the $j$th variable is computed by e04xac.
f_prec – double
Default = $\epsilon^{0.9}$
On entry: specifies $e_R$, which is intended to measure the accuracy with which the problem function $F$ can be computed. The value of $e_R$ should reflect the relative precision of $1 + |F(x)|$, i.e., $e_R$ acts as a relative precision when $|F|$ is large, and as an absolute precision when $|F|$ is small. For example, if $|F(x)|$ is typically of order 1000 and the first six significant figures are known to be correct, an appropriate value of $e_R$ would be $1.0 \times 10^{-6}$. The default value of $\epsilon^{0.9}$ will be appropriate for most simple functions that are computed with full accuracy.
A discussion of $e_R$ is given in Chapter 8 of Gill et al. (1981). If you provide a value of $e_R$ which e04xac determines to be either too small or too large, the default value will be used instead and a warning will be output if optional parameter options.print_deriv = Nag_D_Full. The value actually used is returned in options.f_prec_used.
Constraint: options.f_prec > 0.0.
f_prec_used – double
On exit: if fail.code = NE_NOERROR or NW_DERIV_INFO, or if options.nf > 1 and fail.code = NE_USER_STOP, then options.f_prec_used contains the value of $e_R$ used by e04xac. If you supply a value for options.f_prec and e04xac considers that the value supplied is neither too large nor too small, then this value will be returned in options.f_prec_used; otherwise options.f_prec_used will contain the default value, $\epsilon^{0.9}$.
nf – Integer
On exit: the number of times the objective function has been evaluated (i.e., number of calls of objfun).
11.3 Description of Printed Output
Results from e04xac are printed out by default. The level of printed output can be controlled with the structure members options.list and options.print_deriv (see Section 11.2). If options.list = Nag_TRUE then the argument values to e04xac are listed, whereas printout of results is governed by the value of options.print_deriv.
The default, options.print_deriv = Nag_D_Full, provides the following line of output for each variable.
j
the index of the variable for which the difference interval has been computed.
X(j)
the value of $x_j$ as provided in x[j-1].
Fwd diff int
the best interval found for computing a forward-difference approximation to the appropriate partial derivative with respect to $x_j$.
Cent diff int
the best interval found for computing a central-difference approximation to the appropriate partial derivative with respect to $x_j$.
Error est
a bound on the estimated error in the final forward-difference approximation. When deriv_info[j-1] = Nag_Fun_Constant, Error est is set to zero.
Grad est
best estimate of the first partial derivative with respect to $x_j$.
Hess diag est
best estimate of the second partial derivative with respect to $x_j$.
Nfun
the number of function evaluations used to compute the final difference intervals for $x_j$.
Info
gives diagnostic information for $x_j$. Info will be one of OK, Constant?, Linear or odd?, Large 2nd deriv?, or Small 1st deriv?, corresponding to deriv_info[j-1] = Nag_Deriv_OK, Nag_Fun_Constant, Nag_Fun_LinearOdd, Nag_2ndDeriv_Large or Nag_1stDeriv_Small, respectively.