e04dgc minimizes an unconstrained nonlinear function of several variables using a pre-conditioned, limited memory quasi-Newton conjugate gradient method. The function is intended for use on large scale problems.
The function may be called by the names: e04dgc, nag_opt_uncon_conjgrd_comp or nag_opt_conj_grad.
3 Description
e04dgc uses a pre-conditioned conjugate gradient method and is based upon algorithm PLMA as described in Gill and Murray (1979) and Section 4.8.3 of Gill et al. (1981).
The algorithm proceeds as follows:
Let $x_0$ be a given starting point and let $k$ denote the current iteration, starting with $k=0$. The iteration requires $g_k$, the gradient vector evaluated at $x_k$, the $k$th estimate of the minimum. At each iteration a vector $p_k$ (known as the direction of search) is computed and the new estimate $x_{k+1}$ is given by $x_k+\alpha_k p_k$, where $\alpha_k$ (the step length) minimizes the function $F(x_k+\alpha_k p_k)$ with respect to the scalar $\alpha_k$. At the start of each line search an initial approximation $\alpha_0$ to the step $\alpha_k$ is taken as:
$$\alpha_0=\min\left\{1,\;2\,|F_k-F_{est}|/g_k^Tg_k\right\}$$
where $F_{est}$ is a user-supplied estimate of the function value at the solution. If $F_{est}$ is not specified, the software always chooses the unit step length for $\alpha_0$. Subsequent step length estimates are computed using cubic interpolation with safeguards.
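As a rough illustration of the initial step rule, the sketch below evaluates $\min\{1,\,2|F_k-F_{est}|/g_k^Tg_k\}$ in plain C. The function name `initial_step` and the `has_f_est` switch are hypothetical and are not part of the NAG interface; the formula is taken as stated here, not from the library source.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Initial trial step: alpha0 = min(1, 2|F_k - F_est| / (g_k^T g_k)).
 * has_f_est == 0 models the case where no estimate of the optimal
 * function value was supplied; the unit step is then returned.
 * (Hypothetical helper, not a NAG routine.) */
double initial_step(double f_k, double f_est, int has_f_est,
                    const double *g, size_t n)
{
    double gtg = 0.0;
    for (size_t i = 0; i < n; ++i)
        gtg += g[i] * g[i];
    assert(gtg > 0.0); /* a zero gradient means there is no search direction */

    if (!has_f_est)
        return 1.0;

    double trial = 2.0 * fabs(f_k - f_est) / gtg;
    return trial < 1.0 ? trial : 1.0;
}
```

With a gradient of (3, 4), a current value of 10 and an estimate of 0, the trial step is 20/25 = 0.8; larger gaps between the current value and the estimate are capped at the unit step.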
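A quasi-Newton method computes the search direction, $p_k$, by updating the inverse of the approximate Hessian $(H_k)$ and computing
$$p_{k+1}=-H_{k+1}g_{k+1}. \tag{1}$$
The updating formula for the approximate inverse is given by
$$H_{k+1}=H_k-\frac{1}{y_k^Ts_k}\left(H_ky_ks_k^T+s_ky_k^TH_k\right)+\frac{1}{y_k^Ts_k}\left(1+\frac{y_k^TH_ky_k}{y_k^Ts_k}\right)s_ks_k^T \tag{2}$$
where $y_k=g_{k+1}-g_k$ and $s_k=x_{k+1}-x_k=\alpha_kp_k$.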
The method used by e04dgc to obtain the search direction is based upon computing $p_{k+1}$ as $-H_{k+1}g_{k+1}$, where $H_{k+1}$ is a matrix obtained by updating the identity matrix with a limited number of quasi-Newton corrections. The storage of an $n\times n$ matrix is avoided by storing only the vectors that define the rank two corrections, hence the term limited-memory quasi-Newton method. The precise method depends upon the number of updating vectors stored. For example, the direction obtained with the ‘one-step’ limited memory update is given by (1) using (2) with $H_k$ equal to the identity matrix, viz.
$$p_{k+1}=-g_{k+1}+\frac{1}{y_k^Ts_k}\left(s_k^Tg_{k+1}\,y_k+y_k^Tg_{k+1}\,s_k\right)-\frac{s_k^Tg_{k+1}}{y_k^Ts_k}\left(1+\frac{y_k^Ty_k}{y_k^Ts_k}\right)s_k.$$
e04dgc uses a two-step method described in detail in Gill and Murray (1979) in which restarts and pre-conditioning are incorporated. Using a limited-memory quasi-Newton formula, such as the one above, guarantees $p_{k+1}$ to be a descent direction if the inner products $y_k^Ts_k$ are positive for all vectors $y_k$ and $s_k$ used in the updating formula.
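The ‘one-step’ update can be applied without ever forming a matrix, which is the point of the limited-memory approach. The following is a minimal sketch of that idea only; it assumes the one-step direction $p=-g+\big[(s^Tg)\,y+(y^Tg)\,s\big]/(y^Ts)-\big[(s^Tg)/(y^Ts)\big]\big(1+y^Ty/y^Ts\big)\,s$ and is not the two-step, pre-conditioned scheme that e04dgc actually uses. The names `dot` and `one_step_direction` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* One-step limited-memory quasi-Newton direction, i.e. p = -H g with
 * H obtained from a single BFGS-type update of the identity.
 * Only the vectors g, s and y are needed; no n-by-n matrix is stored. */
void one_step_direction(const double *g, const double *s, const double *y,
                        double *p, size_t n)
{
    double ys = dot(y, s, n);
    assert(ys > 0.0); /* required for p to be a descent direction */

    double sg = dot(s, g, n);
    double yg = dot(y, g, n);
    double yy = dot(y, y, n);

    double coef_y = sg / ys;                                /* multiplies y */
    double coef_s = yg / ys - (sg / ys) * (1.0 + yy / ys);  /* multiplies s */

    for (size_t i = 0; i < n; ++i)
        p[i] = -g[i] + coef_y * y[i] + coef_s * s[i];
}
```

A quick sanity check on the formula: the updated matrix satisfies the secant condition (it maps $y$ to $s$), so passing `g` equal to `y` should return `p` equal to `-s`.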
The termination criteria of e04dgc are as follows:
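Let $\tau_F$ specify an argument that indicates the number of correct figures desired in $F_k$ ($\tau_F$ is equivalent to options.optim_tol in the optional parameter list, see Section 11). If the following three conditions are satisfied:
(i) $F_{k-1}-F_k<\tau_F\,(1+|F_k|)$
(ii) $\|x_{k-1}-x_k\|<\sqrt{\tau_F}\,(1+\|x_k\|)$
(iii) $\|g_k\|\le\tau_F^{1/3}\,(1+|F_k|)$ or $\|g_k\|<\epsilon_A$, where $\epsilon_A$ is the absolute error associated with computing the objective function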
then the algorithm is considered to have converged. For a full discussion on termination criteria see Chapter 8 of Gill et al. (1981).
4 References
Gill P E and Murray W (1979) Conjugate-gradient methods for large-scale nonlinear optimization Technical Report SOL 79-15 Department of Operations Research, Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1983) Computing forward-difference intervals for numerical optimization SIAM J. Sci. Statist. Comput. 4 310–321
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press
5 Arguments
1: n – Integer Input
On entry: $n$, the number of variables.
Constraint: n ≥ 1.
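2: objfun – function, supplied by the user External Function
objfun must calculate the objective function $F(x)$ and its gradient at a specified point $x$.
The specification of objfun is:
void objfun(Integer n, const double x[], double *objf, double g[], Nag_Comm *comm)
1: n – Integer Input
On entry: $n$, the number of variables.
2: x[n] – const double Input
On entry: the point $x$ at which the objective function is required.
3: objf – double * Output
On exit: the value of the objective function $F$ at the current point $x$.
4: g[n] – double Output
On exit: g[i−1] must contain the value of $\partial F/\partial x_i$ at the point $x$, for $i=1,2,\dots,n$.
5: comm – Nag_Comm *
Pointer to structure of type Nag_Comm; the following members are relevant to objfun.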
flag – Integer Input/Output
On entry: comm->flag is always non-negative.
On exit: if objfun resets comm->flag to some negative number then e04dgc will terminate immediately with the error indicator NE_USER_STOP. If fail is supplied to e04dgc, fail.errnum will be set to your setting of comm->flag.
first – Nag_Boolean Input
On entry: will be set to Nag_TRUE on the first call to objfun and Nag_FALSE for all subsequent calls.
nf – Integer Input
On entry: the number of calculations of the objective function; this value will be equal to the number of calls made to objfun including the current one.
user – double *
iuser – Integer *
p – Pointer
The type Pointer will be void * with a C compiler that defines void * and char * otherwise. Before calling e04dgc these pointers may be allocated memory and initialized with various quantities for use by objfun when called from e04dgc.
Note: objfun should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by e04dgc. If your code inadvertently does return any NaNs or infinities, e04dgc is likely to produce unexpected results.
Note: objfun should be tested separately before being used in conjunction with e04dgc. The array x must not be changed by objfun.
3: x[n] – double Input/Output
On entry: $x_0$, an estimate of the solution point $x^*$.
On exit: the final estimate of the solution.
4: objf – double * Output
On exit: the value of the objective function at the final iterate.
5: g[n] – double Output
On exit: the objective gradient at the final iterate.
6: options – Nag_E04_Opt * Input/Output
On entry/exit: a pointer to a structure of type Nag_E04_Opt whose members are optional parameters for e04dgc. These structure members offer the means of adjusting some of the argument values of the algorithm and on output will supply further details of the results. A description of the members of options is given below in Section 11.
If any of these optional parameters are required then the structure options should be declared and initialized by a call to e04xxc and supplied as an argument to e04dgc. However, if the optional parameters are not required the NAG defined null pointer, E04_DEFAULT, can be used in the function call.
7: comm – Nag_Comm * Input/Output
Note: comm is a NAG defined type (see Section 3.1.1 in the Introduction to the NAG Library CL Interface).
On entry/exit: structure containing pointers for communication with user-supplied functions; see the above description of objfun for details. If you do not need to make use of this communication feature the null pointer NAGCOMM_NULL may be used in the call to e04dgc; comm will then be declared internally for use in calls to user-supplied functions.
8: fail – NagError * Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6 Error Indicators and Warnings
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, argument options.print_level had an illegal value.
On entry, argument options.verify_grad had an illegal value.
NE_DERIV_ERRORS
Large errors were found in the derivatives of the objective function.
This value of fail will occur if the verification process indicated that at least one gradient component had no correct figures. You should refer to the printed output to determine which elements are suspected to be in error.
As a first step, you should check that the code for the objective values is correct, for example by computing the function at a point where the correct value is known. However, care should be taken that the chosen point fully tests the evaluation of the function. It is remarkable how often the values $x=0$ or $x=1$ are used to test function evaluation procedures, and how often the special properties of these numbers make the test meaningless.
Errors in programming the function may be quite subtle in that the function value is ‘almost’ correct. For example, the function may not be accurate to full precision because of the inaccurate calculation of a subsidiary quantity, or the limited accuracy of data upon which the function depends.
NE_GRAD_TOO_SMALL
The gradient at the starting point is too small; rerun the problem at a different starting point.
The value of $g(x_0)^Tg(x_0)$ is less than $\epsilon\,|F(x_0)|$, where $\epsilon$ is the machine precision.
NE_INT_ARG_LT
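On entry, n = ⟨value⟩.
Constraint: n ≥ 1.
NE_INVALID_INT_RANGE_1
Value ⟨value⟩ given to options.max_iter not valid. Correct range is max_iter ≥ 0.
NE_INVALID_REAL_RANGE_EF
Value ⟨value⟩ given to options.f_prec not valid. Correct range is $\epsilon$ ≤ f_prec < 1.0.
Value ⟨value⟩ given to options.optim_tol not valid. Correct range is f_prec ≤ optim_tol < 1.0.
NE_INVALID_REAL_RANGE_F
Value ⟨value⟩ given to options.max_line_step not valid. Correct range is max_line_step > 0.0.
NE_INVALID_REAL_RANGE_FF
Value ⟨value⟩ given to options.linesearch_tol not valid. Correct range is 0.0 ≤ linesearch_tol < 1.0.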
NE_NOT_APPEND_FILE
Cannot open file ⟨string⟩ for appending.
NE_NOT_CLOSE_FILE
Cannot close file ⟨string⟩.
NE_OPT_NOT_INIT
Options structure not initialized.
NE_USER_STOP
User requested termination, user flag value = ⟨value⟩.
This exit occurs if you set comm->flag to a negative value in objfun. If fail is supplied the value of fail.errnum will be the same as your setting of comm->flag.
NE_WRITE_ERROR
Error occurred when writing to file ⟨string⟩.
NW_NO_IMPROVEMENT
A sufficient decrease in the function value could not be attained during the final linesearch. Current point cannot be improved upon.
If objfun computes the function and gradients correctly, then this warning may occur because an overly stringent accuracy has been requested, i.e., options.optim_tol is too small, or because the minimum lies close to a step length of zero. In this case you should apply the tests described in Section 3 to determine whether or not the final solution is acceptable. For a discussion of attainable accuracy see Gill et al. (1981).
If many iterations have occurred in which essentially no progress has been made or e04dgc has failed to move from the initial point, then the function objfun may be incorrect. You should refer to the comments above under NE_DERIV_ERRORS and check the gradients using the options.verify_grad argument. Unfortunately, there may be small errors in the objective gradients that cannot be detected by the verification process. Finite difference approximations to first derivatives are catastrophically affected by even small inaccuracies.
NW_STEP_BOUND_TOO_SMALL
Computed upper bound on step length was too small.
The computed upper bound on the step length taken during the linesearch was too small. A rerun with an increased value of options.max_line_step ($\rho$ say) may be successful unless $\rho\ge10^{20}$ (the default value), in which case the current point cannot be improved upon.
NW_TOO_MANY_ITER
The maximum number of iterations, ⟨value⟩, have been performed.
If the algorithm appears to be making progress, the value of options.max_iter may be too small (see Section 11); you should increase its value and rerun e04dgc. If the algorithm seems to be ‘bogged down’, you should check for incorrect gradients or ill-conditioning as described above under NW_NO_IMPROVEMENT.
7 Accuracy
On successful exit the accuracy of the solution will be as defined by the optional parameter options.optim_tol.
8 Parallelism and Performance
e04dgc is not threaded in any implementation.
9 Further Comments
9.1 Timing
Problems whose Hessian matrices at the solution contain sets of clustered eigenvalues are likely to be minimized in significantly fewer than $n$ iterations. Problems without this property may require anything between $n$ and $5n$ iterations, with approximately $2n$ iterations being a common figure for moderately difficult problems.
10 Example
This example minimizes the function
$$F(x)=e^{x_1}\left(4x_1^2+2x_2^2+4x_1x_2+2x_2+1\right).$$
The data includes a set of user-defined column and row names, and data for the Hessian in a sparse storage format (see Section 10.2 for further details). The options structure is declared and initialized by e04xxc. Five option values are read from a data file by use of e04xyc.
11 Optional Parameters
A number of optional input and output arguments to e04dgc are available through the structure argument options, type Nag_E04_Opt. An argument may be selected by assigning an appropriate value to the relevant structure member; those arguments not selected will be assigned default values. If no use is to be made of any of the optional parameters you should use the NAG defined null pointer, E04_DEFAULT, in place of options when calling e04dgc; the default settings will then be used for all arguments.
Before assigning values to options directly the structure must be initialized by a call to the function e04xxc. Values may then be assigned to the structure members in the normal C manner.
After return from e04dgc, the options structure may only be re-used for future calls of e04dgc if the dimensions of the new problem are the same. Otherwise, the structure must be cleared by a call of e04xzc and re-initialized by a call of e04xxc before future calls. Failure to do this will result in unpredictable behaviour.
Option settings may also be read from a text file using the function e04xyc in which case initialization of the options structure will be performed automatically if not already done. Any subsequent direct assignment to the options structure must not be preceded by initialization.
If assignment of functions and memory to pointers in the options structure is required, then this must be done directly in the calling program; they cannot be assigned using e04xyc.
11.1 Optional Parameter Checklist and Default Values
For easy reference, the following list shows the members of options which are valid for e04dgc together with their default values where relevant. The number $\epsilon$ is a generic notation for machine precision (see X02AJC).
11.2 Description of the Optional Parameters
verify_grad – Nag_GradChk
Default = Nag_SimpleCheck
On entry: specifies the level of derivative checking to be performed by e04dgc on the gradient elements defined in objfun.
verify_grad may have the following values:
Nag_NoCheck
No derivative check is performed.
Nag_SimpleCheck
Perform a simple check of the gradient.
Nag_CheckObj
Perform a component check of the gradient elements.
If verify_grad = Nag_SimpleCheck then a simple ‘cheap’ test is performed, which requires only one call to objfun. If verify_grad = Nag_CheckObj then a more reliable (but more expensive) test will be made on individual gradient components. This component check will be made in the range specified by obj_check_start and obj_check_stop, default values being 1 and n respectively. The procedure for the derivative check is based on finding an interval that produces an acceptable estimate of the second derivative, and then using that estimate to compute an interval that should produce a reasonable forward-difference approximation. The gradient element is then compared with the difference approximation. (The method of finite difference interval estimation is based on Gill et al. (1983).) The result of the test is printed out by e04dgc if print_gcheck = Nag_TRUE.
Constraint: verify_grad = Nag_NoCheck, Nag_SimpleCheck or Nag_CheckObj.
print_gcheck – Nag_Boolean
Default = Nag_TRUE
On entry: if Nag_TRUE the result of any derivative check (see verify_grad) will be printed.
obj_check_start – Integer
Default = 1
obj_check_stop – Integer
Default = n
On entry: these options take effect only when verify_grad = Nag_CheckObj. They may be used to control the verification of gradient elements computed by the function objfun. For example, if the first 30 variables appear linearly in the objective, so that the corresponding gradient elements are constant, then it is reasonable for obj_check_start to be set to 31.
Constraint: 1 ≤ obj_check_start ≤ obj_check_stop ≤ n.
max_iter – Integer
Default = max(50, 5n)
On entry: the limit on the number of iterations allowed before termination.
Constraint: max_iter ≥ 0.
f_prec – double
Default = $\epsilon^{0.9}$
On entry: this argument defines $\epsilon_R$, which is intended to be a measure of the accuracy with which the problem function $F$ can be computed. The value of $\epsilon_R$ should reflect the relative precision of $1+|F(x)|$; i.e., $\epsilon_R$ acts as a relative precision when $|F|$ is large, and as an absolute precision when $|F|$ is small. For example, if $F(x)$ is typically of order 1000 and the first six significant digits are known to be correct, an appropriate value for $\epsilon_R$ would be $10^{-6}$. In contrast, if $F(x)$ is typically of order $10^{-4}$ and the first six significant digits are known to be correct, an appropriate value for $\epsilon_R$ would be $10^{-10}$. The choice of $\epsilon_R$ can be quite complicated for badly scaled problems; see Chapter 8 of Gill et al. (1981) for a discussion of scaling techniques. The default value is appropriate for most simple functions that are computed with full accuracy. However, when the accuracy of the computed function values is known to be significantly worse than full precision, the value of $\epsilon_R$ should be large enough so that e04dgc will not attempt to distinguish between function values that differ by less than the error inherent in the calculation.
Constraint: $\epsilon$ ≤ f_prec < 1.0.
optim_tol – double
Default = f_prec$^{0.8}$
On entry: specifies the accuracy to which you wish the final iterate to approximate a solution of the problem. Broadly speaking, optim_tol indicates the number of correct figures desired in the objective function at the solution. For example, if optim_tol is $10^{-6}$ and e04dgc terminates successfully, the final value of $F$ should have approximately six correct figures. e04dgc will terminate successfully if the iterative sequence of $x$-values is judged to have converged and the final point satisfies the termination criteria (see Section 3, where $\tau_F$ represents optim_tol).
Constraint: f_prec ≤ optim_tol < 1.0.
linesearch_tol – double
Default = 0.9
On entry: controls the accuracy with which the step $\alpha$ taken during each iteration approximates a minimum of the function along the search direction (the smaller the value of linesearch_tol, the more accurate the linesearch). The default value requests an inaccurate search, and is appropriate for most problems. A more accurate search may be appropriate when it is desirable to reduce the number of iterations, for example if the objective function is cheap to evaluate.
Constraint: 0.0 ≤ linesearch_tol < 1.0.
max_line_step – double
Default = $10^{20}$
On entry: defines the maximum allowable step length for the line search.
Constraint: max_line_step > 0.0.
f_est – double
On entry: specifies the user-supplied guess of the optimum objective function value. This value is used by e04dgc to calculate an initial step length (see Section 3). If no value is supplied then an initial step length of 1.0 will be used, but it should be noted that for badly scaled functions a unit step along the steepest descent direction will often compute the function at very large values of $x$.
iter – Integer
On exit: the number of iterations which have been performed in e04dgc.
nf – Integer
On exit: the number of times the objective function has been evaluated (i.e., number of calls of objfun). The total excludes the calls made to objfun for purposes of derivative checking.
11.3 Description of Printed Output
The level of printed output can be controlled with the structure members list, print_gcheck and print_level (see Section 11.2). If list = Nag_TRUE then the argument values to e04dgc are listed, followed by the result of any derivative check if print_gcheck = Nag_TRUE. The printout of the optimization results is governed by the value of print_level. The default of print_level = Nag_Soln_Iter provides a single line of output at each iteration and the final result. This section describes all of the possible levels of results printout available from e04dgc.
If a simple derivative check, verify_grad = Nag_SimpleCheck, is requested then the directional derivative, $g(x)^Tp$, of the objective gradient and its finite difference approximation are printed out, where $p$ is a random vector of unit length.
When a component derivative check, verify_grad = Nag_CheckObj, is requested then the following results are supplied for each component:
x[i]
the element of $x$.
dx[i]
the optimal finite difference interval.
g[i]
the gradient element.
Difference approxn.
the finite difference approximation.
Itns
the number of trials performed to find a suitable difference interval.
The indicator, OK or BAD?, states whether the gradient element and finite difference approximation are in agreement.
If the gradient is believed to be in error e04dgc will exit with fail set to NE_DERIV_ERRORS.
When print_level = Nag_Iter or Nag_Soln_Iter a single line of output is produced on completion of each iteration; this gives the following values:
Itn
the current iteration number $k$.
Nfun
the cumulative number of calls to objfun. The evaluations needed for the estimation of the gradients by finite differences are not included in the total Nfun. The value of Nfun is a guide to the amount of work required for the linesearch. e04dgc will perform at most 16 function evaluations per iteration.
Objective
the current value of the objective function, $F(x_k)$.
Norm g
the Euclidean norm of the gradient vector, $\|g(x_k)\|$.
Norm x
the Euclidean norm of $x_k$.
Norm(x(k-1)-x(k))
the Euclidean norm of $x_{k-1}-x_k$.
Step
the step $\alpha_k$ taken along the computed search direction $p_k$.
If print_level = Nag_Soln or Nag_Soln_Iter, the final result is printed out. This consists of:
x
the final point, $x^*$.
g
the final gradient vector, $g(x^*)$.
If print_level = Nag_NoPrint then printout will be suppressed; you can print the final solution when e04dgc returns to the calling program.
11.3.1 Output of results via a user-defined printing function
You may also specify your own print function for output of the results of any gradient check, the optimization results at each iteration and the final solution. The user-defined print function should be assigned to the print_fun function pointer, which has prototype
void (*print_fun)(const Nag_Search_State *st, Nag_Comm *comm);
The rest of this section can be skipped if the default printing facilities provide the required functionality.
When a user-defined function is assigned to print_fun this will be called in preference to the internal print function of e04dgc. Calls to the user-defined function are again controlled by means of the print_level and print_gcheck members. Information is provided through st and comm, the two structure arguments to print_fun.
If comm->it_prt = Nag_TRUE then the results from the last iteration of e04dgc are in the following members of st:
n – Integer
The number of variables.
x – double *
Points to the memory locations holding the current point $x_k$.
f – double
The value of the current objective function.
g – double *
Points to the memory locations holding the first derivatives of $F$ at the current point $x_k$.
If comm->g_prt = Nag_TRUE then the following members are set:
x – double *
Points to the memory locations holding the initial point $x_0$.
g – double *
Points to the memory locations holding the first derivatives of $F$ at the initial point $x_0$.
Details of any derivative check performed by e04dgc are held in the following substructure of st:
gprint – Nag_GPrintSt *
Which in turn contains two substructures, g_chk and f_sim, and a pointer to an array of substructures, f_comp.
g_chk – Nag_Grad_Chk_St *
This substructure contains the members:
type – Nag_GradChk
The type of derivative check performed by e04dgc. This will be the same value as in options.verify_grad.
g_error – Integer
This member will be equal to one of the error codes NE_NOERROR or NE_DERIV_ERRORS according to whether the derivatives were found to be correct or not.
obj_start – Integer
Specifies the gradient element at which any component check started. This value will be equal to options.obj_check_start.
obj_stop – Integer
Specifies the gradient element at which any component check ended. This value will be equal to options.obj_check_stop.
f_sim – Nag_SimSt *
The result of a simple derivative check, verify_grad = Nag_SimpleCheck, will be held in this substructure which has members:
correct – Nag_Boolean
If Nag_TRUE then the objective gradient is consistent with the finite difference approximation according to a simple check.
dir_deriv – double *
The directional derivative $g^Tp$ where $p$ is a random vector of unit length with elements of approximately equal magnitude.
fd_approx – double *
The finite difference approximation, $(F(x+hp)-F(x))/h$, to the directional derivative.
f_comp – Nag_CompSt *
The results of a component derivative check, verify_grad = Nag_CheckObj, will be held in the array of substructures of type Nag_CompSt pointed to by f_comp. The procedure for the derivative check is based on finding an interval that produces an acceptable estimate of the second derivative, and then using that estimate to compute an interval that should produce a reasonable forward-difference approximation. The gradient element is then compared with the difference approximation. (The method of finite difference interval estimation is based on Gill et al. (1983).)
correct – Nag_Boolean
If Nag_TRUE then this objective gradient component is consistent with its finite difference approximation.
hopt – double *
The optimal finite difference interval. This is dx[i] in the NAG default printout.
gdiff – double *
The finite difference approximation for this gradient component.
iter – Integer
The number of trials performed to find a suitable difference interval.
comment – char
A character string which describes the possible nature of the reason for which an estimation of the finite difference interval failed to produce a satisfactory relative condition error of the second-order difference. Possible strings are: "Constant?", "Linear or odd?", "Too nonlinear?" and "Small derivative?".
The relevant members of the structure comm are:
g_prt – Nag_Boolean
Will be Nag_TRUE only when the print function is called with the result of the derivative check of objfun.
it_prt – Nag_Boolean
Will be Nag_TRUE when the print function is called with the result of the current iteration.
sol_prt – Nag_Boolean
Will be Nag_TRUE when the print function is called with the final result.
user – double *
iuser – Integer *
p – Pointer
Pointers for communication of user information. If used they must be allocated memory either before entry to e04dgc or during a call to objfun or print_fun. The type Pointer will be void * with a C compiler that defines void * and char * otherwise.
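A user-defined print function typically branches on the three Boolean members of comm described above. The sketch below shows only that branching pattern. `Comm_Sketch` and `State_Sketch` are hypothetical, heavily reduced stand-ins for the NAG structures, and `format_report` writes into a caller-supplied buffer rather than printing, so that it can be exercised without the library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins holding only the members used in this sketch. */
typedef struct { int g_prt, it_prt, sol_prt; } Comm_Sketch;
typedef struct { int n; const double *x; double f; } State_Sketch;

/* Appends one line per active flag to buf and returns the number of
 * sections requested. The flags are tested independently because more
 * than one of them could be set on the same call. */
int format_report(const State_Sketch *st, const Comm_Sketch *comm,
                  char *buf, size_t len)
{
    assert(st && comm && buf && len > 0);
    int sections = 0;
    size_t used = 0;
    buf[0] = '\0';

    if (comm->g_prt) {
        int w = snprintf(buf, len, "Gradient check, n = %d\n", st->n);
        if (w > 0) used = (size_t)w;
        ++sections;
    }
    if (comm->it_prt && used < len) {
        int w = snprintf(buf + used, len - used,
                         "Itn result: F = %.6e\n", st->f);
        if (w > 0) used += (size_t)w;
        ++sections;
    }
    if (comm->sol_prt && used < len) {
        snprintf(buf + used, len - used, "Final result: F = %.6e\n", st->f);
        ++sections;
    }
    return sections;
}
```

In a real print function the same tests would be made on the Nag_Boolean members of the Nag_Comm structure, and the derivative check details would be read from the gprint substructure of st.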