e04dg minimizes an unconstrained nonlinear function of several variables using a pre-conditioned, limited memory quasi-Newton conjugate gradient method. First derivatives (or an ‘acceptable’ finite difference approximation to them) are required. It is intended for use on large scale problems.

Syntax

C#
public static void e04dg( int n, E04..::..E04DG_OBJFUN objfun, out int iter, out double objf, double[] objgrd, double[] x, E04..::..e04dgOptions options, out int ifail )

Visual Basic
Public Shared Sub e04dg ( _ n As Integer, _ objfun As E04..::..E04DG_OBJFUN, _ <OutAttribute> ByRef iter As Integer, _ <OutAttribute> ByRef objf As Double, _ objgrd As Double(), _ x As Double(), _ options As E04..::..e04dgOptions, _ <OutAttribute> ByRef ifail As Integer _ )

Visual Basic

Public Shared Sub e04dg ( _
	n As Integer, _
	objfun As E04..::..E04DG_OBJFUN, _
	<OutAttribute> ByRef iter As Integer, _
	<OutAttribute> ByRef objf As Double, _
	objgrd As Double(), _
	x As Double(), _
	options As E04..::..e04dgOptions, _
	<OutAttribute> ByRef ifail As Integer _
)

Visual C++
public: static void e04dg( int n, E04..::..E04DG_OBJFUN^ objfun, [OutAttribute] int% iter, [OutAttribute] double% objf, array<double>^ objgrd, array<double>^ x, E04..::..e04dgOptions^ options, [OutAttribute] int% ifail )

Visual C++

public:
static void e04dg(
	int n, 
	E04..::..E04DG_OBJFUN^ objfun, 
	[OutAttribute] int% iter, 
	[OutAttribute] double% objf, 
	array<double>^ objgrd, 
	array<double>^ x, 
	E04..::..e04dgOptions^ options, 
	[OutAttribute] int% ifail
)

F#
static member e04dg : n : int * objfun : E04..::..E04DG_OBJFUN * iter : int byref * objf : float byref * objgrd : float[] * x : float[] * options : E04..::..e04dgOptions * ifail : int byref -> unit

Parameters

n: Type: System..::..Int32
On entry: $n$ , the number of variables.

Constraint: $n > 0$ .

objfun: Type: NagLibrary..::..E04..::..E04DG_OBJFUN
objfun must calculate the objective function $F (x)$ and possibly its gradient as well for a specified $n$ -element vector $x$ .
A delegate of type E04DG_OBJFUN.
Note: objfun should be tested separately before being used in conjunction with e04dg. See also the description of the optional parameter Verify.

iter: Type: System..::..Int32%
On exit: the total number of iterations performed.

objf: Type: System..::..Double%
On exit: the value of the objective function at the final iterate.

objgrd: Type: array<System..::..Double>[]()[][]
An array of size [n]
On exit: the gradient of the objective function at the final iterate (or its finite difference approximation).

x: Type: array<System..::..Double>[]()[][]
An array of size [n]
On entry: an initial estimate of the solution.
On exit: the final estimate of the solution.

options: Type: NagLibrary..::..E04..::..e04dgOptions
An Object of type E04.e04dgOptions. Used to configure optional parameters to this method.

ifail: Type: System..::..Int32%
On exit: $ifail = 0$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

Description

e04dg is designed to solve unconstrained minimization problems of the form

\underset{x \in R^{n}}{minimize} F (x) subject to - \infty \leq x \leq \infty,

where

x

is an

n

-element vector.

You must supply an initial estimate of the solution.

The method used by e04dg is described in [Algorithmic Details].

References

Gill P E and Murray W (1979) Conjugate-gradient methods for large-scale nonlinear optimization Technical Report SOL 79-15 Department of Operations Research, Stanford University

Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press

Error Indicators and Warnings

Note: e04dg may return useful information for one or more of the following detected errors or warnings.

Errors or warnings detected by the method:

$ifail < 0$: A negative value of ifail indicates an exit from e04dg because you set $mode < 0$ in objfun. The value of ifail will be the same as your setting of mode.

$ifail = 1$: Not used by this method.

$ifail = 2$: Not used by this method.

$ifail = 3$: The limiting number of iterations (as determined by the optional parameter Iteration Limit ( $default value = \max (50, 5 n)$ ) has been reached.
If the algorithm appears to be making satisfactory progress, then optional parameter Iteration Limit may be too small. If so, increase its value and rerun e04dg. If the algorithm seems to be making little or no progress, then you should check for incorrect gradients as described under $ifail = 7$ .

$ifail = 4$: The computed upper bound on the step length taken during the linesearch was too small. A rerun with an increased value of the optional parameter Maximum Step Length ( $ρ$ say) may be successful unless $ρ \geq 10^{20}$ (the default value), in which case the current point cannot be improved upon.

$ifail = 5$: Not used by this method.

$ifail = 6$: The conditions for an acceptable solution (see parameter ifail in [Parameters]) have not all been met, but a lower point could not be found.
If objfun computes the objective function and its gradient correctly, then this may occur because an overly stringent accuracy has been requested, i.e., the value of the optional parameter Optimality Tolerance ( $default value = ε^{0.8}$ ) is too small or if $α_{k} ≃ 0$ . In this case you should apply the three tests described under $ifail = 0$ to determine whether or not the final solution is acceptable. For a discussion of attainable accuracy see Gill et al. (1981).
If many iterations have occurred in which essentially no progress has been made or e04dg has failed to move from the initial point, objfun may be incorrect. You should refer to the comments below under $ifail = 7$ and check the gradients using the optional parameter Verify ( $default value = 0$ ). Unfortunately, there may be small errors in the objective gradients that cannot be detected by the verification process. Finite difference approximations to first derivatives are catastrophically affected by even small inaccuracies.

$ifail = 7$: The user-supplied derivatives of the objective function appear to be incorrect.
Large errors were found in the derivatives of the objective function. This value of ifail will occur if the verification process indicated that at least one gradient element had no correct figures. You should refer to the printed output to determine which elements are suspected to be in error.
As a first step, you should check that the code for the objective values is correct – for example, by computing the function at a point where the correct value is known. However, care should be taken that the chosen point fully tests the evaluation of the function. It is remarkable how often the values $x = 0$ or $x = 1$ are used to test function evaluation procedures, and how often the special properties of these numbers make the test meaningless.
Special care should be used in this test if computation of the objective function involves subsidiary data communicated in storage. Although the first evaluation of the function may be correct, subsequent calculations may be in error because some of the subsidiary data has accidentally been overwritten.
Errors in programming the function may be quite subtle in that the function value is almost correct. For example, the function may not be accurate to full precision because of the inaccurate calculation of a subsidiary quantity, or the limited accuracy of data upon which the function depends. A common error on machines where numerical calculations are usually performed in double precision is to include even one single precision constant in the calculation of the function; since some compilers do not convert such constants to double precision, half the correct figures may be lost by such a seemingly trivial error.

$ifail = 8$: The gradient $(g = \frac{\partial F}{\partial x})$ at the starting point $x_{0}$ is ‘too small’. More precisely, the value of $g {(x_{0})}^{T} g (x_{0})$ is less than $ε_{r} |1 + F (x_{0})|$ , where $ε_{r}$ is the value of the optional parameter Function Precision ( $default value = ε^{0.9}$ ).
The problem should be rerun from a different starting point.

$ifail = 9$: An input parameter is invalid.

$ifail = -9000$: An error occured, see message report.
$ifail = -8000$: Negative dimension for array $〈value〉$
$ifail = -6000$: Invalid Parameters $〈value〉$

Accuracy

On successful exit (

ifail = 0

) the accuracy of the solution will be as defined by the optional parameter Optimality Tolerance (

default value = ε^{0.8}

Parallelism and Performance

None.

Further Comments

To evaluate an ‘acceptable’ set of finite difference intervals using e04xa requires

2

function evaluations per variable for a well-scaled problem and up to

6

function evaluations per variable for a badly scaled problem.

Description of Printed Output

This section describes the intermediate printout and final printout produced by e04dg. You can control the level of printed output (see the description of the optional parameter Print Level). Note that the intermediate printout and final printout are produced only if

Print Level \geq 10

(the default ).

The following line of summary output (

< 80

characters) is produced at every iteration. In all cases, the values of the quantities are those in effect on completion of the given iteration.

Itn	is the iteration count.
Step	is the step $α_{k}$ taken along the computed search direction. On reasonably well-behaved problems, the unit step (i.e., $α_{k} = 1$ ) will be taken as the solution is approached.
Nfun	is the cumulated number of evaluations of the objective function needed for the linesearch. Evaluations needed for the verification of the gradients by finite differences are not included. Nfun is printed as a guide to the amount of work required for the linesearch. e04dg will perform at most $11$ function evaluations per iteration.
Objective	is the value of the objective function at $x_{k}$ .
Norm G	is the Euclidean norm of the gradient of the objective function at $x_{k}$ .
Norm X	is the Euclidean norm of $x_{k}$ .
Norm (X(k-1)-X(k))	is the Euclidean norm of $x_{k - 1} - x_{k}$ .

The following describes the printout for each variable.

Variable	gives the name (Varbl) and index $j$ , for $j = 1, 2, \dots, n$ of the variable.
Value	is the value of the variable at the final iteration.
Gradient Value	is the value of the gradient of the objective function with respect to the $j$ th variable at the final iteration.

Numerical values are output with a fixed number of digits; they are not guaranteed to be accurate to this precision.

Example

This example finds a minimum of the function

F = e^{x_{1}} (4 x_{1}^{2} + 2 x_{2}^{2} + 4 x_{1} x_{2} + 2 x_{2} + 1) .

The initial point is

x_{0} = {(- 1.0, 1.0)}^{T},

and

F (x_{0}) = 1.8394

(to five figures).

The optimal solution is

x^{*} = {(0.5, - 1.0)}^{T},

and

F (x^{*}) = 0

Example program (C#): e04dge.cs

Example program data: e04dge.d

Example program results: e04dge.r

Algorithmic Details

This section contains a description of the method used by e04dg.

e04dg uses a pre-conditioned conjugate gradient method and is based upon algorithm PLMA as described in Section 4.8.3 of Gill and Murray (1979) and Gill et al. (1981).

The algorithm proceeds as follows:

Let

x_{0}

be a given starting point and let

k

denote the current iteration, starting with

k = 0

. The iteration requires

g_{k}

, the gradient vector evaluated at

x_{k}

, the

k

th estimate of the minimum. At each iteration a vector

p_{k}

(known as the direction of search) is computed and the new estimate

x_{k + 1}

is given by

x_{k} + α_{k} p_{k}

where

α_{k}

(the step length) minimizes the function

F (x_{k} + α_{k} p_{k})

with respect to the scalar

α_{k}

. A choice of initial step

α_{0}

is taken as

α_{0} = \min \{1, 2 \times |F_{k} - F_{est}| / g_{k}^{T} g_{k}\}

where

F_{est}

is a user-supplied estimate of the function value at the solution. If

F_{est}

is not specified, the software always chooses the unit step length for

α_{0}

. Subsequent step length estimates are computed using cubic interpolation with safeguards.

A quasi-Newton method can be used to compute the search direction

p_{k}

by updating the inverse of the approximate Hessian

(H_{k})

and computing

p_{k + 1} = - H_{k + 1} g_{k + 1} .

(1)

The updating formula for the approximate inverse is given by

H_{k + 1} = H_{k} - \frac{1}{y_{k}^{T} s_{k}} (H_{k} y_{k} s_{k}^{T} + s_{k} y_{k}^{T} H_{k}) + \frac{1}{y_{k}^{T} s_{k}} (1 + \frac{y_{k}^{T} H_{k} y_{k}}{y_{k}^{T} s_{k}}) s_{k} s_{k}^{T},

(2)

where

y_{k} = g_{k - 1} - g_{k}

and

s_{k} = x_{k + 1} - x_{k} = α_{k} p_{k}

The method used to obtain the search direction is based upon computing

p_{k + 1}

- H_{k + 1} g_{k + 1}

where

H_{k + 1}

is a matrix obtained by updating the identity matrix with a limited number of quasi-Newton corrections. The storage of an

n

n

matrix is avoided by storing only the vectors that define the rank two corrections – hence the term ‘limited-memory’ quasi-Newton method. The precise method depends upon the number of updating vectors stored. For example, the direction obtained with the ‘one-step’ limited memory update is given by (1) using (2) with

H_{k}

equal to the identity matrix, viz.

p_{k + 1} = - g_{k + 1} + \frac{1}{y_{k}^{T} s_{k}} (s_{k}^{T} g_{k + 1} y_{k} + y_{k}^{T} g_{k + 1} s_{k}) - \frac{s_{k}^{T} g_{k + 1}}{y_{k}^{T} s_{k}} (1 + \frac{y_{k}^{T} y_{k}}{y_{k}^{T} s_{k}}) s_{k} .

Using a limited-memory quasi-Newton formula, such as the one above, guarantees

p_{k + 1}

to be a descent direction if all the inner products

y_{k}^{T} s_{k}

are positive for all vectors

y_{k}

and

s_{k}

used in the updating formula.

Syntax

Parameters

Description

References

Error Indicators and Warnings

Accuracy

Parallelism and Performance

Further Comments

Description of Printed Output

Example

Algorithmic Details

See Also