naginterfaces.library.opt.estimate_deriv
- naginterfaces.library.opt.estimate_deriv(msglvl, epsrf, x, mode, objfun, hforw, data=None, io_manager=None)
estimate_deriv computes an approximation to the gradient vector and/or the Hessian matrix for use in conjunction with, or following the use of, an optimization function (such as nlp1_rcomm()).
For full information please refer to the NAG Library document for e04xa
https://support.nag.com/numeric/nl/nagdoc_30.3/flhtml/e04/e04xaf.html
- Parameters
- msglvl : int
Must indicate the amount of intermediate output desired (see Further Comments for a description of the printed output). All output is written on the file object associated with the advisory I/O unit (see FileObjManager).
Value    Definition
0        No printout.
1        A summary is printed out for each variable plus any warning messages.
Other    Values other than 0 and 1 should normally be used only at the direction of NAG.
- epsrf : float
Must define $e_R$, which is intended to be a measure of the accuracy with which the problem function $F$ can be computed. The value of $e_R$ should reflect the relative precision of $1 + |F(x)|$, i.e., $e_R$ acts as a relative precision when $|F|$ is large, and as an absolute precision when $|F|$ is small. For example, if $F(x)$ is typically of order 1000 and the first six significant digits are known to be correct, an appropriate value for $e_R$ would be $10^{-6}$.
A discussion of epsrf is given in Chapter 8 of Gill et al. (1981).
If epsrf is either too small or too large on entry, a warning will be printed if msglvl = 1, the argument iwarn will be set to the appropriate value on exit, and estimate_deriv will use a default value of $e_M^{0.9}$, where $e_M$ is the machine precision.
If epsrf ≤ 0 on entry, then estimate_deriv will use the default value internally. The default value will be appropriate for most simple functions that are computed with full accuracy.
- x : float, array-like, shape (n)
The point x at which the derivatives are to be computed.
- mode : int
Indicates which derivatives are required.
mode = 0: The gradient and Hessian diagonal values, having supplied the objective function via objfun.
mode = 1: The Hessian matrix, having supplied both the objective function and gradients via objfun.
mode = 2: The gradient values and Hessian matrix, having supplied the objective function via objfun.
- objfun : callable (objf, objgrd) = objfun(mode, x, nstate, data=None)
If mode = 0 or 2, objfun must calculate the objective function; otherwise, if mode = 1, objfun must calculate the objective function and the gradients.
- Parameters
- mode : int
mode indicates which argument values within objfun need to be set.
To objfun, mode is always set to the value that you set it to before the call to estimate_deriv.
- x : float, ndarray, shape (n)
The point x at which the objective function (and gradients if mode = 1) is to be evaluated.
- nstate : int
Will be set to 1 on the first call of objfun by estimate_deriv, and is 0 for all subsequent calls. Thus, if you wish, nstate may be tested within objfun in order to perform certain calculations once only. For example, you may read data.
- data : arbitrary, optional, modifiable in place
User-communication data for callback functions.
- Returns
- objf : float
Must be set to the value of the objective function.
- objgrd : float, array-like, shape (n)
If mode = 1, objgrd[j-1] must contain the value of the first derivative with respect to $x_j$.
If mode ≠ 1, objgrd need not be set.
- hforw : float, array-like, shape (n)
The initial trial interval for computing the appropriate partial derivative to the jth variable.
If hforw[j-1] ≤ 0.0, the initial trial interval is computed by estimate_deriv (see Notes).
- data : arbitrary, optional
User-communication data for callback functions.
- io_manager : FileObjManager, optional
Manager for I/O in this routine.
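As an illustration of the callback contract described above, the following is a minimal sketch of an objfun for a two-variable Rosenbrock function. The test function, the use of a dict for data, and the choice to return a zero-filled objgrd when mode is not 1 are all assumptions made for this example; they are not prescribed by the library.

```python
def objfun(mode, x, nstate, data=None):
    """Illustrative callback: Rosenbrock function of two variables.

    Returns (objf, objgrd) in the shape the callback signature expects.
    """
    # Assumption: data, if supplied, is a dict used here to count calls.
    if data is not None:
        if nstate == 1:
            # One-off initialisation on the first call only.
            data["n_calls"] = 0
        data["n_calls"] += 1

    x1, x2 = x[0], x[1]
    objf = 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2

    # The gradient is only required when mode == 1; otherwise a
    # zero-filled placeholder is returned (an assumption of this sketch).
    objgrd = [0.0, 0.0]
    if mode == 1:
        objgrd[0] = -400.0 * x1 * (x2 - x1 ** 2) - 2.0 * (1.0 - x1)
        objgrd[1] = 200.0 * (x2 - x1 ** 2)
    return objf, objgrd
```

A function of this form would be passed as the objfun argument in a call shaped like estimate_deriv(msglvl, epsrf, x, mode, objfun, hforw, data=...).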
- Returns
- mode : int
Is changed only if you set mode negative in objfun, i.e., you have requested termination of estimate_deriv.
- hforw : float, ndarray, shape (n)
hforw[j-1] is the best interval found for computing a forward-difference approximation to the appropriate partial derivative for the jth variable.
- objf : float
The value of the objective function evaluated at the input vector in x.
- objgrd : float, ndarray, shape (n)
If mode = 0 or 2, objgrd[j-1] contains the best estimate of the first partial derivative for the jth variable.
If mode = 1, objgrd[j-1] contains the first partial derivative for the jth variable evaluated at the input vector in x.
- hcntrl : float, ndarray, shape (n)
hcntrl[j-1] is the best interval found for computing a central-difference approximation to the appropriate partial derivative for the jth variable.
- h : float, ndarray, shape (n, :)
If mode = 0, the estimated Hessian diagonal elements are contained in the first column of this array.
If mode = 1 or 2, the estimated Hessian matrix is contained in the leading n by n part of this array.
- iwarn : int
iwarn = 0 on successful exit.
If the value of epsrf on entry is too small or too large then iwarn is set to 1 or 2 respectively on exit and the default value for epsrf is used within estimate_deriv.
If msglvl > 0 then warnings will be printed if epsrf is too small or too large.
- info : int, ndarray, shape (n)
info[j-1] represents diagnostic information on variable j as follows:
info[j-1] = 1: The appropriate function appears to be constant. hforw[j-1] is set to the initial trial interval value (see Notes) corresponding to a well-scaled problem and Error est. in the printed output is set to zero. This value occurs when the estimated relative condition error in the first derivative approximation is unacceptably large for every value of the finite difference interval. If this happens when the function is not constant the initial interval may be too small; in this case, it may be worthwhile to rerun estimate_deriv with larger initial trial interval values supplied in hforw (see Notes). This error may also occur if the function evaluation includes an inordinately large constant term or if epsrf is too large.
info[j-1] = 2: The appropriate function appears to be linear or odd. hforw[j-1] is set to the smallest interval with acceptable bounds on the relative condition error in the forward- and backward-difference estimates. In this case, the estimated relative condition error in the second derivative approximation remained large for every trial interval, but the estimated error in the first derivative approximation was acceptable for at least one interval. If the function is not linear or odd, the relative condition error in the second derivative may be decreasing very slowly; it may be worthwhile to rerun estimate_deriv with larger initial trial interval values supplied in hforw (see Notes).
info[j-1] = 3: The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (i.e., near a singularity). hforw[j-1] is set to the smallest trial interval. This value occurs when the relative condition error estimate in the second derivative remained very small for every trial interval. If the second derivative is not large, the relative condition error in the second derivative may be increasing very slowly. It may be worthwhile to rerun estimate_deriv with smaller initial trial interval values supplied in hforw (see Notes). This error may also occur when the given value of epsrf is not a good estimate of a bound on the absolute error in the appropriate function (i.e., epsrf is too small).
info[j-1] = 4: The algorithm terminated with an apparently acceptable estimate of the second derivative. However, the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval) and the central-difference estimates (computed with the interval used to compute the final estimate of the second derivative) do not agree to half a decimal place. The usual reason that the forward- and central-difference estimates fail to agree is that the first derivative is small.
If the first derivative is not small, it may be helpful to execute the procedure at a different point.
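The per-variable diagnostics can be turned into a readable summary after a call. The sketch below is a hypothetical helper, not part of naginterfaces; it assumes that 0 means no diagnostic and that codes 1 to 4 follow the order of the four descriptions above.

```python
# Assumed numeric codes: 0 means no diagnostic was raised; 1-4 follow
# the order in which the four conditions are described in the text.
INFO_MEANINGS = {
    1: "function appears to be constant",
    2: "function appears to be linear or odd",
    3: "second derivative appears too large to estimate reliably",
    4: "forward- and central-difference first-derivative estimates disagree",
}


def summarise_info(info):
    """Return (variable number, message) pairs for every nonzero diagnostic."""
    return [
        (j + 1, INFO_MEANINGS.get(int(code), "unrecognised code %d" % int(code)))
        for j, code in enumerate(info)
        if int(code) != 0
    ]
```

For instance, summarise_info applied to the returned info array lists only the variables that may need a rerun with different trial intervals in hforw.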
- Raises
- NagValueError
- (errno )
On entry, .
Constraint: .
- (errno )
On entry, .
Constraint: .
- Warns
- NagAlgorithmicWarning
- (errno )
One or more variables have a nonzero info value.
- NagCallbackTerminateWarning
- (errno )
User requested termination by setting mode negative in objfun.
- Notes
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.
estimate_deriv is similar to routine FDCALC described in Gill et al. (1983a). It should be noted that this function aims to compute sufficiently accurate estimates of the derivatives for use with an optimization algorithm. If you require more accurate estimates you should refer to submodule numdiff.
estimate_deriv computes finite difference approximations to the gradient vector and the Hessian matrix for a given function. The simplest approximation involves the forward-difference formula, in which the derivative $f'(x)$ of a univariate function $f(x)$ is approximated by the quantity
$$\rho_F(f, h) = \frac{f(x+h) - f(x)}{h}$$
for some interval $h > 0$, where the subscript ‘F’ denotes ‘forward-difference’ (see Gill et al. (1983b)).
To summarise the procedure used by estimate_deriv (for the case when the objective function is available and you require estimates of gradient values and Hessian matrix diagonal values, i.e., mode = 0), consider a univariate function $f$ at the point $x$. (In order to obtain the gradient of a multivariate function $F(x)$, where $x$ is an $n$-vector, the procedure is applied to each component of $x$, keeping the other components fixed.) Roughly speaking, the method is based on the fact that the bound on the relative truncation error in the forward-difference approximation tends to be an increasing function of $h$, while the relative condition error bound is generally a decreasing function of $h$; hence changes in $h$ will tend to have opposite effects on these errors (see Gill et al. (1983b)).
The ‘best’ interval $h$ is given by
$$h_F = 2\sqrt{\frac{(1 + |f(x)|)\, e_R}{|\Phi|}} \qquad (1)$$
where $\Phi$ is an estimate of $f''(x)$, and $e_R$ is an estimate of the relative error associated with computing the function (see Chapter 8 of Gill et al. (1981)). Given an interval $h$, $\Phi$ is defined by the second-order approximation
$$\Phi = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}.$$
The decision as to whether a given value of $\Phi$ is acceptable involves $\hat{c}(\Phi)$, the following bound on the relative condition error in $\Phi$:
$$\hat{c}(\Phi) = \frac{4 e_R (1 + |f|)}{h^2 |\Phi|}.$$
(When $\Phi$ is zero, $\hat{c}(\Phi)$ is taken as an arbitrary large number.)
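The three quantities in this passage (the second-order estimate of the second derivative, the bound on its relative condition error, and the resulting forward-difference interval) can be sketched in plain Python as follows. The function names are hypothetical, the formulas are the standard ones from Gill et al. (1983b) as understood here, and infinity stands in for the "arbitrary large number".

```python
import math


def second_difference(f, x, h):
    """Second-order estimate of f''(x) using f(x + h), f(x) and f(x - h)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def condition_error_bound(fx, phi, h, eps_r):
    """Bound on the relative condition error in the estimate phi."""
    if phi == 0.0:
        # Assumption: infinity plays the role of an arbitrary large number.
        return math.inf
    return 4.0 * eps_r * (1.0 + abs(fx)) / (h * h * abs(phi))


def best_forward_interval(fx, phi, eps_r):
    """Forward-difference interval from equation (1); phi must be nonzero."""
    return 2.0 * math.sqrt((1.0 + abs(fx)) * eps_r / abs(phi))
```

Here fx is the function value at x, phi the estimate of the second derivative, and eps_r the relative accuracy of the function (the role played by epsrf).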
The procedure selects the interval $h_\phi$ (to be used in computing $\Phi$) from a sequence of trial intervals $(h_k)$. The initial trial interval is taken as $10\bar{h}$, where
$$\bar{h} = 2(1 + |x|)\sqrt{e_R}$$
unless you specify the initial value to be used.
The value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as ‘acceptable’ if it lies in the interval $[0.001, 0.1]$. In this case $h_\phi$ is taken as $h_k$, and the current value of $\Phi$ is used to compute $h_F$ from (1). If $\hat{c}(\Phi)$ is unacceptable, the next trial interval is chosen so that the relative condition error bound will either decrease or increase, as required. If the bound on the relative condition error is too large, a larger interval is used as the next trial value in an attempt to reduce the condition error bound. On the other hand, if the relative condition error bound is too small, $h_k$ is reduced.
The procedure will fail to produce an acceptable value of $\hat{c}(\Phi)$ in two situations. Firstly, if $f''(x)$ is extremely small, then $\hat{c}(\Phi)$ may never become small, even for a very large value of the interval. Alternatively, $\hat{c}(\Phi)$ may never exceed $0.001$, even for a very small value of the interval. This usually implies that $f''(x)$ is extremely large, and occurs most often near a singularity.
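The interval search can be pictured with the following simplified sketch. It is an illustration under stated assumptions, not the routine's actual logic: the step factor of 10, the cap on the number of trials, the default acceptance band, and all function names are choices made for this example.

```python
import math


def second_difference(f, x, h):
    """Second-order estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def condition_error_bound(fx, phi, h, eps_r):
    """Bound on the relative condition error in phi (infinite if phi is zero)."""
    if phi == 0.0:
        return math.inf
    return 4.0 * eps_r * (1.0 + abs(fx)) / (h * h * abs(phi))


def select_interval(f, x, eps_r, h0=None, lower=1.0e-3, upper=1.0e-1,
                    factor=10.0, max_trials=20):
    """Search trial intervals until the condition error bound is acceptable.

    Returns (h_phi, phi, h_f) on success, or None if no trial is acceptable.
    """
    fx = f(x)
    if h0 is not None and h0 > 0.0:
        h = h0  # caller-supplied initial trial interval
    else:
        h = 10.0 * 2.0 * (1.0 + abs(x)) * math.sqrt(eps_r)
    for _ in range(max_trials):
        phi = second_difference(f, x, h)
        c = condition_error_bound(fx, phi, h, eps_r)
        if lower <= c <= upper:
            h_f = 2.0 * math.sqrt((1.0 + abs(fx)) * eps_r / abs(phi))
            return h, phi, h_f
        # Bound too large: enlarge the interval; too small: shrink it.
        h = h * factor if c > upper else h / factor
    return None
```

With this sketch a constant function never yields an acceptable bound, so the search returns None, which mirrors the first failure case described above.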
As a check on the validity of the estimated first derivative, the procedure provides a comparison of the forward-difference approximation computed with $h_F$ (as above) and the central-difference approximation computed with $h_\phi$. Using the central-difference formula the first derivative can be approximated by
$$\rho_c(f, h) = \frac{f(x+h) - f(x-h)}{2h}$$
where $h > 0$. If the values $\rho_F(f, h_F)$ and $\rho_c(f, h_\phi)$ do not display some agreement, neither can be considered reliable.
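A minimal sketch of this consistency check is given below. The agreement test and its tolerance rel_tol are hypothetical choices for illustration; the routine's own criterion is not reproduced here.

```python
import math


def central_difference(f, x, h):
    """Central-difference approximation (f(x + h) - f(x - h)) / (2 h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def estimates_agree(forward_est, central_est, rel_tol=0.1):
    """Crude agreement test; rel_tol is an arbitrary illustrative tolerance."""
    scale = max(abs(forward_est), abs(central_est))
    if scale == 0.0:
        return True
    return abs(forward_est - central_est) <= rel_tol * scale


# Compare the two estimates of the derivative of sin at 0 (exact value 1).
forward_est = (math.sin(0.0 + 1.0e-6) - math.sin(0.0)) / 1.0e-6
central_est = central_difference(math.sin, 0.0, 1.0e-4)
```

When the first derivative is very small, a relative test of this kind tends to report disagreement even though both estimates are usable, which matches the remark that a small first derivative is the usual cause of a mismatch.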
When both function and gradients are available and you require the Hessian matrix (i.e., mode = 1), estimate_deriv follows a similar procedure to the case above, with the exception that the gradient function $g(x)$ is substituted for the objective function and so the forward-difference interval for the first derivative of $g(x)$ with respect to variable $x_j$ is computed. The $j$th column of the approximate Hessian matrix is then defined as in Chapter 2 of Gill et al. (1981), by
$$\frac{g(x + h_j e_j) - g(x)}{h_j}$$
where $h_j$ is the best forward-difference interval associated with the $j$th component of $g$ and $e_j$ is the vector with unity in the $j$th position and zeros elsewhere.
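The column formula for the gradient-based case can be sketched directly. The helper below is hypothetical and uses zero-based indexing for j; it takes the interval as given rather than searching for it.

```python
def hessian_column(grad, x, j, h_j):
    """Forward-difference estimate of column j of the Hessian from gradients.

    grad maps a list of n floats to a list of n floats; j is zero-based.
    """
    x_step = list(x)
    x_step[j] += h_j  # x + h_j * e_j
    g0 = grad(x)
    g1 = grad(x_step)
    return [(a - b) / h_j for a, b in zip(g1, g0)]


def example_grad(x):
    """Gradient of x0**2 + 3*x0*x1 + 2*x1**2, whose Hessian is [[2, 3], [3, 4]]."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]


column0 = hessian_column(example_grad, [1.0, 2.0], 0, 1.0e-3)
```

Because the example gradient is linear, the difference quotient is exact up to rounding; for a general function the accuracy depends on the choice of interval.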
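When only the objective function is available and you require the gradients and Hessian matrix (i.e., mode = 2), estimate_deriv again follows the same procedure as the case for mode = 0, except that this time the value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as acceptable if it lies in the interval $[0.0001, 0.01]$ and the initial trial interval is taken as
$$\bar{h} = 2(1 + |x|)\sqrt[4]{e_R}.$$
The approximate Hessian matrix $G$ is then defined as in Chapter 2 of Gill et al. (1981), by
$$G_{ij}(x) = \frac{1}{h_i h_j}\left( f(x + h_i e_i + h_j e_j) - f(x + h_i e_i) - f(x + h_j e_j) + f(x) \right).$$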
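For the function-only case, a full Hessian estimate can be assembled from function values alone using the standard second-order difference of Gill et al. (1981). The sketch below assumes that formula and takes the per-variable intervals h as given; the helper names are hypothetical.

```python
def hessian_from_function(f, x, h):
    """Estimate the Hessian of f at x from function values only.

    x and h are lists of n floats; h[i] is the interval for variable i.
    """
    n = len(x)

    def shifted(steps):
        # Return x with each (index, step) pair in steps added.
        y = list(x)
        for i, step in steps:
            y[i] += step
        return y

    fx = f(x)
    f_single = [f(shifted([(i, h[i])])) for i in range(n)]
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            f_both = f(shifted([(i, h[i]), (j, h[j])]))
            value = (f_both - f_single[i] - f_single[j] + fx) / (h[i] * h[j])
            hess[i][j] = value
            hess[j][i] = value  # the estimate is symmetric by construction
    return hess


def example_f(x):
    """Quadratic with exact Hessian [[2, 3], [3, 4]]."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2


hess = hessian_from_function(example_f, [1.0, 2.0], [1.0e-2, 1.0e-2])
```

Only the upper triangle is evaluated and then mirrored, which costs n(n+1)/2 function evaluations for the two-step terms plus n + 1 for the single-step and base values.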
- References
Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1983a, Documentation for FDCALC and FDCORE, Technical Report SOL 83–6, Stanford University
Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1983b, Computing forward-difference intervals for numerical optimization, SIAM J. Sci. Statist. Comput. 4, 310–321
Gill, P E, Murray, W and Wright, M H, 1981, Practical Optimization, Academic Press