e04lbc (bounds_mod_deriv2_comp) : NAG Library CL Interface, Mark 28

e04lbc is a comprehensive modified-Newton algorithm for finding:

– an unconstrained minimum of a function of several variables;
– a minimum of a function of several variables subject to fixed upper and/or lower bounds on the variables.

First and second derivatives are required. e04lbc is intended for objective functions which have continuous first and second derivatives (although it will usually work even if the derivatives have occasional discontinuities).

The function may be called by the names: e04lbc, nag_opt_bounds_mod_deriv2_comp or nag_opt_bounds_2nd_deriv.

e04lbc is applicable to problems of the form:

$\begin{array}{ll} \text{Minimize} & F(x_1, x_2, \dots, x_n) \\ \text{subject to} & l_j \leq x_j \leq u_j, \quad j = 1, 2, \dots, n. \end{array}$

Special provision is made for unconstrained minimization (i.e., problems which actually have no bounds on the $x_j$), problems which have only non-negativity bounds, and problems in which $l_1 = l_2 = \dots = l_n$ and $u_1 = u_2 = \dots = u_n$. It is possible to specify that a particular $x_j$ should be held constant. You must supply a starting point, a function objfun to calculate the value of $F(x)$ and its first derivatives $\frac{\partial F}{\partial x_j}$ at any point $x$, and a function hessfun to calculate the second derivatives $\frac{\partial^2 F}{\partial x_i \partial x_j}$.

A typical iteration starts at the current point $x$ where $n_z$ (say) variables are free from both their bounds. The vector of first derivatives of $F(x)$ with respect to the free variables, $g_z$, and the matrix of second derivatives with respect to the free variables, $H$, are obtained. (These both have dimension $n_z$.) The equations

$(H + E) p_z = -g_z$

are solved to give a search direction $p_z$. (The matrix $E$ is chosen so that $H + E$ is positive definite.)

$p_z$ is then expanded to an $n$-vector $p$ by the insertion of appropriate zero elements; $α$ is found such that $F(x + α p)$ is approximately a minimum (subject to the fixed bounds) with respect to $α$, and $x$ is replaced by $x + α p$. (If a saddle point is found, a special search is carried out so as to move away from the saddle point.) If any variable actually reaches a bound, it is fixed and $n_z$ is reduced for the next iteration.
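The expansion of $p_z$ to $p$ and the restriction of $α$ by the fixed bounds can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code used inside e04lbc: the index array free_idx and both function names are hypothetical, and the step limit is the ordinary ratio test for simple bounds.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Scatter the reduced direction pz (length nz) into the full n-vector p.
   free_idx[i] is the 0-based index in x of the i-th free variable
   (a hypothetical bookkeeping array, not a NAG argument). */
void expand_direction(size_t n, size_t nz, const size_t *free_idx,
                      const double *pz, double *p)
{
    assert(nz <= n);
    for (size_t j = 0; j < n; ++j)
        p[j] = 0.0;                 /* fixed variables do not move */
    for (size_t i = 0; i < nz; ++i) {
        assert(free_idx[i] < n);
        p[free_idx[i]] = pz[i];
    }
}

/* Largest alpha >= 0 for which l <= x + alpha*p <= u holds componentwise,
   assuming l <= x <= u on entry. Returns DBL_MAX when no bound limits
   the step. */
double max_feasible_step(size_t n, const double *x, const double *p,
                         const double *l, const double *u)
{
    double alpha_max = DBL_MAX;
    for (size_t j = 0; j < n; ++j) {
        double room;
        if (p[j] > 0.0)
            room = (u[j] - x[j]) / p[j];   /* moving towards the upper bound */
        else if (p[j] < 0.0)
            room = (l[j] - x[j]) / p[j];   /* moving towards the lower bound */
        else
            continue;                      /* this component does not move */
        if (room < alpha_max)
            alpha_max = room;
    }
    return alpha_max;
}
```

A linear search for $α$ would then be confined to the interval from zero to the value returned by max_feasible_step; a variable whose bound determines that value is a candidate for being fixed at the next iteration.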

There are two sets of convergence criteria: a weaker and a stronger. Whenever the weaker criteria are satisfied, the Lagrange multipliers are estimated for all active constraints. If any Lagrange multiplier estimate is significantly negative, then one of the variables associated with a negative Lagrange multiplier estimate is released from its bound and the next search direction is computed in the extended subspace (i.e., $n_z$ is increased). Otherwise, minimization continues in the current subspace until the stronger criteria are satisfied. If at this point there are no negative or near-zero Lagrange multiplier estimates, the process is terminated.
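As an illustration of the release test, the sketch below uses first-order multiplier estimates for simple bounds: $\partial F / \partial x_j$ for a variable on its lower bound and $-\partial F / \partial x_j$ for a variable on its upper bound. The estimates actually used by e04lbc are not specified here, and releasing the variable with the most negative estimate is an assumption made for the example; the enum, the tolerance tol and the function name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
enum bound_status { BS_FREE = 0, BS_AT_LOWER, BS_AT_UPPER };

/* Returns the index of the bound variable whose first-order multiplier
   estimate is the most negative and below -tol, or n if there is none. */
size_t variable_to_release(size_t n, const double *g,
                           const enum bound_status *status, double tol)
{
    assert(tol >= 0.0);
    size_t best = n;
    double most_negative = -tol;
    for (size_t j = 0; j < n; ++j) {
        double lambda;
        if (status[j] == BS_AT_LOWER)
            lambda = g[j];          /* F must not decrease as x_j increases */
        else if (status[j] == BS_AT_UPPER)
            lambda = -g[j];         /* F must not decrease as x_j decreases */
        else
            continue;               /* free variables have no multiplier */
        if (lambda < most_negative) {
            most_negative = lambda;
            best = j;
        }
    }
    return best;
}
```

A return value of n corresponds to the case in which minimization continues in the current subspace.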

If you specify that the problem is unconstrained, e04lbc sets the $l_j$ to $-10^{10}$ and the $u_j$ to $10^{10}$. Thus, provided that the problem has been sensibly scaled, no bounds will be encountered during the minimization process and e04lbc will act as an unconstrained minimization algorithm.

Gill P E and Murray W (1973) Safeguarded steplength algorithms for optimization using descent methods NPL Report NAC 37 National Physical Laboratory

Gill P E and Murray W (1974) Newton-type methods for unconstrained and linearly constrained optimization Math. Programming 7 311–350

Gill P E and Murray W (1976) Minimization subject to bounds on the variables NPL Report NAC 72 National Physical Laboratory

A successful exit (fail.code = NE_NOERROR) is made from e04lbc when $H^{(k)}$ is positive definite and when (B1, B2 and B3) or B4 hold, where

$\mathrm{B1} \equiv α^{(k)} \times ‖p^{(k)}‖ < (\text{options.optim\_tol} + \sqrt{ε}) \times (1.0 + ‖x^{(k)}‖)$
$\mathrm{B2} \equiv |F^{(k)} - F^{(k-1)}| < (\text{options.optim\_tol}^2 + ε) \times (1.0 + |F^{(k)}|)$
$\mathrm{B3} \equiv ‖g_z^{(k)}‖ < (ε^{1/3} + \text{options.optim\_tol}) \times (1.0 + |F^{(k)}|)$
$\mathrm{B4} \equiv ‖g_z^{(k)}‖ < 0.01 \times \sqrt{ε}$.

(Quantities with superscript $k$ are the values at the $k$th iteration of the quantities mentioned in Section 3; $ε$ is the machine precision, $‖\cdot‖$ denotes the Euclidean norm and options.optim_tol is described in Section 11.)

If fail.code = NE_NOERROR, then the vector in x on exit, $x_{sol}$, is almost certainly an estimate of the position of the minimum, $x_{true}$, to the accuracy specified by options.optim_tol.

If fail.code = NW_COND_MIN or NW_LAGRANGE_MULT_ZERO, $x_{sol}$ may still be a good estimate of $x_{true}$, but the following checks should be made. Let the largest of the first $n_z$ elements of the optional parameter options.hesd be options.hesd[b], let the smallest be options.hesd[s], and define $κ = \text{options.hesd}[b] / \text{options.hesd}[s]$. The scalar $κ$ is usually a good estimate of the condition number of the projected Hessian matrix at $x_{sol}$. If

(a) the sequence $\{F(x^{(k)})\}$ converges to $F(x_{sol})$ at a superlinear or fast linear rate,
(b) $‖g_z(x_{sol})‖^2 < 10.0 \times ε$, and
(c) $κ < 1.0 / ‖g_z(x_{sol})‖$,

then it is almost certain that $x_{sol}$ is a close approximation to the position of a minimum. When (b) is true, then usually $F(x_{sol})$ is a close approximation to $F(x_{true})$. The quantities needed for these checks are all available in the results printout from e04lbc; in particular the final value of Cond H gives $κ$.
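Checks (b) and (c) can also be made programmatically from the first $n_z$ elements of options.hesd and the norm of the projected gradient. The sketch below assumes those values have been copied into ordinary C variables, uses DBL_EPSILON as a stand-in for $ε$, and uses hypothetical function names; check (a) still has to be judged from the iteration history.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* kappa = largest / smallest of the first nz entries of hesd (the diagonal
   of D). Assumes every entry is positive, as is the case when the
   factorized matrix is positive definite. */
double cond_estimate(const double *hesd, size_t nz)
{
    assert(nz > 0);
    double dmax = hesd[0], dmin = hesd[0];
    for (size_t i = 1; i < nz; ++i) {
        if (hesd[i] > dmax) dmax = hesd[i];
        if (hesd[i] < dmin) dmin = hesd[i];
    }
    assert(dmin > 0.0);
    return dmax / dmin;
}

/* Checks (b) and (c) only. */
bool gradient_checks_pass(const double *hesd, size_t nz, double gz_norm)
{
    const double eps = DBL_EPSILON;   /* assumed stand-in for machine precision */
    const bool check_b = gz_norm * gz_norm < 10.0 * eps;
    /* (c) kappa < 1/||g_z||, rearranged so that gz_norm == 0 is harmless */
    const bool check_c = cond_estimate(hesd, nz) * gz_norm < 1.0;
    return check_b && check_c;
}
```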

Further suggestions about confirmation of a computed solution are given in the E04 Chapter Introduction.

Background information to multithreading can be found in the Multithreading documentation.

e04lbc is not threaded in any implementation.

The number of iterations required depends on the number of variables, the behaviour of $F(x)$, the accuracy demanded and the distance of the starting point from the solution. The number of multiplications performed in an iteration of e04lbc is $n_z^3 / 6 + O(n_z^2)$. In addition, each iteration makes one call of hessfun and at least one call of objfun. So, unless $F(x)$ and its derivatives can be evaluated very quickly, the run time will be dominated by the time spent in objfun.

Ideally, the problem should be scaled so that, at the solution, $F(x)$ and the corresponding values of the $x_j$ are each in the range $(-1, +1)$, and so that at points one unit away from the solution, $F(x)$ differs from its value at the solution by approximately one unit. This will usually imply that the Hessian matrix at the solution is well conditioned. It is unlikely that you will be able to follow these recommendations very closely, but it is worth trying (by guesswork), as sensible scaling will reduce the difficulty of the minimization problem, so that e04lbc will take less computer time.

If a problem is genuinely unconstrained and has been scaled sensibly, the following points apply:

(a) $n_z$ will always be $n$,
(b) the optional parameters options.hesl and options.hesd will be factors of the full approximate second derivative matrix with elements stored in the natural order,
(c) the elements of g should all be close to zero at the final point,
(d) the Status values given in the printout from e04lbc, and in the optional parameter options.state on exit, are unlikely to be of interest (unless they are negative, which would indicate that the modulus of one of the $x_j$ has reached $10^{10}$ for some reason),
(e) Norm g simply gives the norm of the first derivative vector.

This example minimizes the function

$F = (x_1 + 10 x_2)^2 + 5 (x_3 - x_4)^2 + (x_2 - 2 x_3)^4 + 10 (x_1 - x_4)^4$

subject to the bounds

$\begin{array}{l} 1 \leq x_1 \leq 3 \\ -2 \leq x_2 \leq 0 \\ 1 \leq x_4 \leq 3 \end{array}$

starting from the initial guess $(1.46, -0.82, 0.57, 1.21)^T$.
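For reference, the value of this $F$ and its first and second derivatives can be written out in plain C as below. The derivatives were obtained by differentiating the formula above by hand. The function names and argument lists are hypothetical and deliberately do not follow the objfun and hessfun calling conventions, and the Hessian is returned as a full matrix rather than in any packed form that hessfun may require.

```c
#include <assert.h>
#include <stddef.h>

/* Returns F(x) and stores the four first derivatives in g. */
double example_objgrad(const double x[4], double g[4])
{
    assert(x != NULL && g != NULL);
    const double a = x[0] + 10.0 * x[1];   /* x1 + 10 x2 */
    const double b = x[2] - x[3];          /* x3 - x4    */
    const double c = x[1] - 2.0 * x[2];    /* x2 - 2 x3  */
    const double d = x[0] - x[3];          /* x1 - x4    */

    g[0] = 2.0 * a + 40.0 * d * d * d;
    g[1] = 20.0 * a + 4.0 * c * c * c;
    g[2] = 10.0 * b - 8.0 * c * c * c;
    g[3] = -10.0 * b - 40.0 * d * d * d;

    return a * a + 5.0 * b * b + c * c * c * c + 10.0 * d * d * d * d;
}

/* Stores the full symmetric 4-by-4 matrix of second derivatives in h. */
void example_hessian(const double x[4], double h[4][4])
{
    assert(x != NULL && h != NULL);
    const double c = x[1] - 2.0 * x[2];
    const double d = x[0] - x[3];
    const double c2 = c * c, d2 = d * d;

    for (size_t i = 0; i < 4; ++i)
        for (size_t j = 0; j < 4; ++j)
            h[i][j] = 0.0;

    h[0][0] = 2.0 + 120.0 * d2;
    h[1][1] = 200.0 + 12.0 * c2;
    h[2][2] = 10.0 + 48.0 * c2;
    h[3][3] = 10.0 + 120.0 * d2;
    h[0][1] = h[1][0] = 20.0;
    h[0][3] = h[3][0] = -120.0 * d2;
    h[1][2] = h[2][1] = -24.0 * c2;
    h[2][3] = h[3][2] = -10.0;
}
```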

The options structure is declared and initialized by e04xxc. One option value is read from a data file by use of e04xyc. The memory freeing function e04xzc is used to free the memory assigned to the pointers in the option structure. You must not use the standard C function free() for this purpose.

Program Text (e04lbce.c)

Program Options (e04lbce.opt)

Program Results (e04lbce.r)

A number of optional input and output arguments to e04lbc are available through the structure argument options, type Nag_E04_Opt. An argument may be selected by assigning an appropriate value to the relevant structure member; those arguments not selected will be assigned default values. If no use is to be made of any of the optional parameters you should use the NAG defined null pointer, E04_DEFAULT, in place of options when calling e04lbc; the default settings will then be used for all arguments.

Before assigning values to options directly the structure must be initialized by a call to the function e04xxc. Values may then be assigned to the structure members in the normal C manner.

After return from e04lbc, the options structure may only be re-used for future calls of e04lbc if the dimensions of the new problem are the same. Otherwise, the structure must be cleared by a call of e04xzc and re-initialized by a call of e04xxc before future calls. Failure to do this will result in unpredictable behaviour.

Option settings may also be read from a text file using the function e04xyc in which case initialization of the options structure will be performed automatically if not already done. Any subsequent direct assignment to the options structure must not be preceded by initialization.

If assignment of functions and memory to pointers in the options structure is required, then this must be done directly in the calling program; they cannot be assigned using e04xyc.

For easy reference, the following list shows the members of options which are valid for e04lbc together with their default values where relevant. The number $ε$ is a generic notation for machine precision (see X02AJC).

Boolean list	Nag_TRUE
Nag_PrintType print_level	Nag_Soln_Iter
char outfile[512]	stdout
void (*print_fun)()	NULL
Boolean deriv_check	Nag_TRUE
Integer max_iter	50n
double optim_tol	$10 \sqrt{ε}$
double linesearch_tol	$0.9$ ( $0.0$ if $n = 1$ )
double step_max	100000.0
Integer *state	size n
double *hesl	size $\max (n (n - 1) / 2, 1)$
double *hesd	size n
Integer iter
Integer nf

On entry: if options.list = Nag_TRUE the argument settings in the call to e04lbc will be printed.

On entry: the level of results printout produced by e04lbc. The following values are available:

Nag_NoPrint	No output.
Nag_Soln	The final solution.
Nag_Iter	One line of output for each iteration.
Nag_Soln_Iter	The final solution and one line of output for each iteration.
Nag_Soln_Iter_Full	The final solution and detailed printout at each iteration.

Details of each level of results printout are described in Section 11.3.

Constraint: options.print_level = Nag_NoPrint, Nag_Soln, Nag_Iter, Nag_Soln_Iter or Nag_Soln_Iter_Full.

On entry: the name of the file to which results should be printed. If options.outfile[0] = '\0' then the stdout stream is used.

On entry: printing function defined by you; the prototype of options.print_fun is

void (*print_fun)(const Nag_Search_State *st, Nag_Comm *comm);

See Section 11.3.1 below for further details.

On entry: if options.deriv_check = Nag_TRUE a check of the derivatives defined by objfun and hessfun will be made at the starting point x. A starting point of $x = 0$ or $x = 1$ should be avoided if this test is to be meaningful.
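If you want an independent check of your first derivatives before calling e04lbc, a central-difference comparison is a common approach. The sketch below is not the check performed by e04lbc: the callback type, the function names, the demonstration function and the choice of difference interval h are all assumptions made for illustration. It also suggests why special points can be misleading: at $x = 0$, for instance, many terms and their derivatives vanish together, so an error in a coefficient may go unnoticed.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical callback type: returns F(x) and fills g with the gradient. */
typedef double (*objgrad_fn)(size_t n, const double *x, double *g);

/* Largest absolute difference between the analytic gradient and a
   central-difference estimate with interval h. x is perturbed one
   component at a time and restored; work has length n. */
double max_gradient_error(objgrad_fn fun, size_t n, double *x,
                          double *work, double h)
{
    assert(h > 0.0);
    double worst = 0.0;
    for (size_t j = 0; j < n; ++j) {
        const double xj = x[j];
        x[j] = xj + h;
        const double f_plus = fun(n, x, work);
        x[j] = xj - h;
        const double f_minus = fun(n, x, work);
        x[j] = xj;
        (void)fun(n, x, work);     /* analytic gradient at the original point */
        const double err = fabs(work[j] - (f_plus - f_minus) / (2.0 * h));
        if (err > worst)
            worst = err;
    }
    return worst;
}

/* Demonstration function: F = x0^2 * x1 + 3 * x1. */
double demo_objgrad(size_t n, const double *x, double *g)
{
    assert(n == 2);
    g[0] = 2.0 * x[0] * x[1];
    g[1] = x[0] * x[0] + 3.0;
    return x[0] * x[0] * x[1] + 3.0 * x[1];
}
```

A returned value that is large relative to the size of the gradient suggests an error in the derivative code; the same idea can be applied to the second derivatives by differencing the gradient.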

On entry: the limit on the number of iterations allowed before termination.

Constraint: options.max_iter $\geq 0$.

On entry: the accuracy in $x$ to which the solution is required. If $x_{true}$ is the true value of $x$ at the minimum, then $x_{sol}$, the estimated position prior to a normal exit, is such that

$‖x_{sol} - x_{true}‖ < \text{options.optim\_tol} \times (1.0 + ‖x_{true}‖)$,

where $‖y‖ = (\sum_{j=1}^{n} y_j^2)^{1/2}$. For example, if the elements of $x_{sol}$ are not much larger than $1.0$ in modulus and if options.optim_tol is set to $10^{-5}$, then $x_{sol}$ is usually accurate to about five decimal places. (For further details see Section 9.) If the problem is scaled roughly as described in Section 9 and $ε$ is the machine precision, then $\sqrt{ε}$ is probably the smallest reasonable choice for options.optim_tol. (This is because, normally, to machine accuracy, $F(x + \sqrt{ε} e_j) = F(x)$ where $e_j$ is any column of the identity matrix.)

Constraint: $ε \leq$ options.optim_tol $< 1.0$.

On entry: every iteration of e04lbc involves a linear minimization (i.e., minimization of $F(x + α p)$ with respect to $α$). options.linesearch_tol specifies how accurately these linear minimizations are to be performed. The minimum with respect to $α$ will be located more accurately for small values of options.linesearch_tol (say 0.01) than for large values (say 0.9).

Although accurate linear minimizations will generally reduce the number of iterations performed by e04lbc, they will increase the number of function evaluations required for each iteration. On balance, it is usually more efficient to perform a low accuracy linear minimization.

A smaller value such as $0.01$ may be worthwhile:

(a) if objfun takes so little computer time that it is worth using extra calls of objfun to reduce the number of iterations and associated matrix calculations;
(b) if calls to hessfun are expensive compared with calls to objfun;
(c) if $F(x)$ is a penalty or barrier function arising from a constrained minimization problem (since such problems are very difficult to solve).

If $n = 1$, the default is options.linesearch_tol $= 0.0$. If the problem is effectively one-dimensional even though $n > 1$ (i.e., the lower and upper bounds are equal for all except one of the variables), then options.linesearch_tol should also be set to $0.0$.

Constraint: $0.0 \leq$ options.linesearch_tol $< 1.0$.

On entry: an estimate of the Euclidean distance between the solution and the starting point supplied by you. (For maximum efficiency a slight overestimate is preferable.) e04lbc will ensure that, for each iteration,

$(\sum_{j=1}^{n} [x_j^{(k)} - x_j^{(k-1)}]^2)^{1/2} \leq \text{options.step\_max}$,

where $k$ is the iteration number. Thus, if the problem has more than one solution, e04lbc is most likely to find the one nearest the starting point. On difficult problems, a realistic choice can prevent the sequence of $x^{(k)}$ entering a region where the problem is ill-behaved and can also help to avoid possible overflow in the evaluation of $F(x)$. However, an underestimate of options.step_max can lead to inefficiency.

Constraint: options.step_max $\geq$ options.optim_tol.

On exit: options.state contains information about which variables are on their bounds and which are free at the final point given in x. If $x_j$ is:

(a) fixed on its upper bound, options.state[$j - 1$] is $-1$;
(b) fixed on its lower bound, options.state[$j - 1$] is $-2$;
(c) effectively a constant (i.e., $l_j = u_j$), options.state[$j - 1$] is $-3$;
(d) free, options.state[$j - 1$] gives its position in the sequence of free variables.
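The coding of options.state can be decoded with a few lines of C. In this sketch the array is assumed to have been copied into an array of long (the library's Integer type may be a different integer type), positions of free variables are assumed to be positive, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Human-readable meaning of one state entry. */
const char *describe_state(long s)
{
    switch (s) {
    case -1: return "fixed on upper bound";
    case -2: return "fixed on lower bound";
    case -3: return "constant (l_j = u_j)";
    default: return s > 0 ? "free" : "unrecognised";
    }
}

/* Number of free variables, n_z, implied by the state array. */
size_t count_free(const long *state, size_t n)
{
    assert(state != NULL || n == 0);
    size_t nz = 0;
    for (size_t j = 0; j < n; ++j)
        if (state[j] > 0)
            ++nz;
    return nz;
}
```

The count returned by count_free is the $n_z$ that determines how many elements of options.hesl and options.hesd are meaningful on exit.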

On exit: during the determination of a direction $p_z$ (see Section 3), $H + E$ is decomposed into the product $L D L^T$, where $L$ is a unit lower triangular matrix and $D$ is a diagonal matrix. (The matrices $H$, $E$, $L$ and $D$ are all of dimension $n_z$, where $n_z$ is the number of variables free from their bounds. $H$ consists of those rows and columns of the full second derivative matrix which relate to free variables. $E$ is chosen so that $H + E$ is positive definite.) options.hesl and options.hesd are used to store the factors $L$ and $D$. The elements of the strict lower triangle of $L$ are stored row by row in the first $n_z (n_z - 1) / 2$ positions of options.hesl. The diagonal elements of $D$ are stored in the first $n_z$ positions of options.hesd.
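The row-by-row packing means that, with 0-based indices, element $(i, j)$ of the strict lower triangle of $L$ (with $j < i$) sits at position $i (i - 1) / 2 + j$. The sketch below uses that indexing to rebuild the dense product $L D L^T$ from copies of the two arrays; the function names are hypothetical and the routine is intended only as an aid to interpreting the stored factors.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j), 0-based, of the unit lower triangular matrix L whose
   strict lower triangle is packed row by row in hesl. */
static double l_elem(const double *hesl, size_t i, size_t j)
{
    if (i == j) return 1.0;            /* unit diagonal is not stored */
    if (j > i)  return 0.0;            /* upper triangle is zero */
    return hesl[i * (i - 1) / 2 + j];
}

/* Forms a = L * D * L^T as a dense nz-by-nz matrix in row-major order. */
void rebuild_ldlt(const double *hesl, const double *hesd, size_t nz,
                  double *a)
{
    assert(nz > 0);
    for (size_t i = 0; i < nz; ++i) {
        for (size_t j = 0; j < nz; ++j) {
            const size_t kmax = i < j ? i : j;   /* L is lower triangular */
            double sum = 0.0;
            for (size_t k = 0; k <= kmax; ++k)
                sum += l_elem(hesl, i, k) * hesd[k] * l_elem(hesl, j, k);
            a[i * nz + j] = sum;
        }
    }
}
```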

In the last factorization before a normal exit, the matrix $E$ will be zero, so that options.hesl and options.hesd will contain, on exit, the factors of the final second derivative matrix $H$. The elements of options.hesd are useful for deciding whether to accept the result produced by e04lbc (see Section 9).

On exit: the number of iterations which have been performed in e04lbc.

On exit: the number of times the objective function has been evaluated (i.e., number of calls of objfun).

The level of printed output can be controlled with the structure members options.list and options.print_level (see Section 11.2). If options.list = Nag_TRUE then the argument values to e04lbc are listed, whereas the printout of results is governed by the value of options.print_level. The default of options.print_level = Nag_Soln_Iter provides a single line of output at each iteration and the final result. This section describes all of the possible levels of results printout available from e04lbc.

When options.print_level = Nag_Iter or Nag_Soln_Iter the following line of output is produced on completion of each iteration.

Itn	the iteration count, $k$.
Nfun	the cumulative number of calls made to objfun.
Objective	the value of the objective function, $F(x^{(k)})$.
Norm g	the Euclidean norm of the projected gradient vector, $‖g_z(x^{(k)})‖$.
Norm x	the Euclidean norm of $x^{(k)}$.
Norm(x(k-1)-x(k))	the Euclidean norm of $x^{(k-1)} - x^{(k)}$.
Step	the step $α^{(k)}$ taken along the computed search direction $p^{(k)}$.
Cond H	the ratio of the largest to the smallest element of the diagonal factor $D$ of the projected Hessian matrix. This quantity is usually a good estimate of the condition number of the projected Hessian matrix. (If no variables are currently free, this value will be zero.)
PosDef	indicates whether the second derivative matrix for the current subspace, $H$, is positive definite (Yes) or not (No).

When options.print_level = Nag_Soln, Nag_Soln_Iter or Nag_Soln_Iter_Full this single line of output is also produced for the final solution.

When options.print_level = Nag_Soln_Iter_Full more detailed results are given at each iteration. Additional values output are

x	the current point $x^{(k)}$.
g	the current projected gradient vector, $g_z(x^{(k)})$.
Status	the current state of the variable with respect to its bound(s).

If options.print_level = Nag_Soln, Nag_Soln_Iter or Nag_Soln_Iter_Full the final result is printed out. This consists of:

x	the final point, $x^{*}$.
g	the final projected gradient vector, $g_z(x^{*})$.
Status	the final state of the variable with respect to its bound(s).

If options.print_level = Nag_NoPrint then printout will be suppressed; you can print the final solution when e04lbc returns to the calling program.

You may also specify your own print function for output of iteration results and the final solution by use of the options.print_fun function pointer, which has prototype

void (*print_fun)(const Nag_Search_State *st, Nag_Comm *comm);

The rest of this section can be skipped if the default printing facilities provide the required functionality.

When a user-defined function is assigned to options.print_fun this will be called in preference to the internal print function of e04lbc. Calls to the user-defined function are again controlled by means of the options.print_level member. Information is provided through st and comm, the two structure arguments to options.print_fun.

If comm->it_prt = Nag_TRUE then the results on completion of an iteration of e04lbc are contained in the members of st. If comm->sol_prt = Nag_TRUE then the final results from e04lbc, including details of the final iteration, are contained in the members of st. In both cases, the same members of st are set, as follows:

The relevant members of the structure comm are:

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

9.1 Timing

9.2 Scaling

9.3 Unconstrained Minimization

10 Example

10.1 Program Text

10.2 Program Data

10.3 Program Results

11 Optional Parameters

11.1 Optional Parameter Checklist and Default Values

11.2 Description of the Optional Parameters

11.3 Description of Printed Output

11.3.1 Output of results via a user-defined printing function
