e04uf is designed to minimize an arbitrary smooth function subject to constraints (which may include simple bounds on the variables, linear constraints and smooth nonlinear constraints) using a sequential quadratic programming (SQP) method. You should supply as many first derivatives as possible; any unspecified derivatives are approximated by finite differences. It is not intended for large sparse problems.
e04uf may also be used for unconstrained, bound-constrained and linearly constrained optimization.
e04uf uses reverse communication for evaluating the objective function, the nonlinear constraint functions and any of their derivatives.
Syntax
C#

public static void e04uf( ref int irevcm, int n, int nclin, int ncnln, double[,] a, double[] bl, double[] bu, ref int iter, int[] istate, double[] c, double[,] cjac, double[] clamda, ref double objf, double[] objgrd, double[,] r, double[] x, int[] needc, int[] iwork, double[] work, E04..::..e04ufOptions options, out int ifail )

Visual Basic

Public Shared Sub e04uf ( _ ByRef irevcm As Integer, _ n As Integer, _ nclin As Integer, _ ncnln As Integer, _ a As Double(,), _ bl As Double(), _ bu As Double(), _ ByRef iter As Integer, _ istate As Integer(), _ c As Double(), _ cjac As Double(,), _ clamda As Double(), _ ByRef objf As Double, _ objgrd As Double(), _ r As Double(,), _ x As Double(), _ needc As Integer(), _ iwork As Integer(), _ work As Double(), _ options As E04..::..e04ufOptions, _ <OutAttribute> ByRef ifail As Integer _ )

Visual C++

public: static void e04uf( int% irevcm, int n, int nclin, int ncnln, array<double,2>^ a, array<double>^ bl, array<double>^ bu, int% iter, array<int>^ istate, array<double>^ c, array<double,2>^ cjac, array<double>^ clamda, double% objf, array<double>^ objgrd, array<double,2>^ r, array<double>^ x, array<int>^ needc, array<int>^ iwork, array<double>^ work, E04..::..e04ufOptions^ options, [OutAttribute] int% ifail )

F#

static member e04uf : irevcm : int byref * n : int * nclin : int * ncnln : int * a : float[,] * bl : float[] * bu : float[] * iter : int byref * istate : int[] * c : float[] * cjac : float[,] * clamda : float[] * objf : float byref * objgrd : float[] * r : float[,] * x : float[] * needc : int[] * iwork : int[] * work : float[] * options : E04..::..e04ufOptions * ifail : int byref -> unit
Parameters
- irevcm
- Type: System..::..Int32%. On initial entry: must be set to 0. On intermediate exit: specifies what values the calling program must assign to parameters of e04uf before re-entering the method. On intermediate re-entry: must remain unchanged, unless you wish to terminate the solution to the current problem. In this case irevcm may be set to a negative value and then e04uf will take a final exit with ifail set to this value of irevcm. On final exit: irevcm = 0. Constraint: .
- n
- Type: System..::..Int32On initial entry: , the number of variables.Constraint: .
- nclin
- Type: System..::..Int32On initial entry: , the number of general linear constraints.Constraint: .
- ncnln
- Type: System..::..Int32On initial entry: , the number of nonlinear constraints.Constraint: .
- a
- Type: array<System..::..Double,2>[,](,)[,][,]An array of size [dim1, dim2]Note: dim1 must satisfy the constraint:Note: the second dimension of the array a must be at least if , and at least otherwise.On initial entry: the th row of the matrix of general linear constraints in (1) must be stored in , for and . That is, the th row contains the coefficients of the th general linear constraint, for .If , the array a is not referenced.
- bl
- Type: array<System..::..Double>[]()[][]An array of size []On initial entry: bl must contain the lower bounds and bu the upper bounds, for all the constraints in the following order. The first elements of each array must contain the bounds on the variables, the next elements the bounds for the general linear constraints (if any) and the next elements the bounds for the general nonlinear constraints (if any). To specify a nonexistent lower bound (i.e., ), set , and to specify a nonexistent upper bound (i.e., ), set ; the default value of is , but this may be changed by the optional parameter Infinite Bound Size. To specify the th constraint as an equality, set , say, where .Constraints:
- , for ;
- if , .
- bu
- Type: array<System..::..Double>[]()[][]An array of size []On initial entry: bl must contain the lower bounds and bu the upper bounds, for all the constraints in the following order. The first elements of each array must contain the bounds on the variables, the next elements the bounds for the general linear constraints (if any) and the next elements the bounds for the general nonlinear constraints (if any). To specify a nonexistent lower bound (i.e., ), set , and to specify a nonexistent upper bound (i.e., ), set ; the default value of is , but this may be changed by the optional parameter Infinite Bound Size. To specify the th constraint as an equality, set , say, where .Constraints:
- , for ;
- if , .
- iter
- Type: System..::..Int32%On intermediate re-entry: must remain unchanged from a previous call to e04uf.On final exit: the number of major iterations performed.
- istate
- Type: array<System..::..Int32>[]()[][]An array of size []On initial entry: need not be set if the (default) optional parameter Cold Start is used.If the optional parameter Warm Start has been chosen, the elements of istate corresponding to the bounds and linear constraints define the initial working set for the procedure that finds a feasible point for the linear constraints and bounds. The active set at the conclusion of this procedure and the elements of istate corresponding to nonlinear constraints then define the initial working set for the first QP subproblem. More precisely, the first elements of istate refer to the upper and lower bounds on the variables, the next elements refer to the upper and lower bounds on , and the next elements refer to the upper and lower bounds on . Possible values for are as follows:
The possible values of an element of istate on entry, and their meanings, are as follows:

0 | The corresponding constraint is not in the initial QP working set. |
1 | This inequality constraint should be in the working set at its lower bound. |
2 | This inequality constraint should be in the working set at its upper bound. |
3 | This equality constraint should be in the initial working set. This value must not be specified unless the corresponding elements of bl and bu are equal. |

The values −2, −1 and 4 are also acceptable but will be modified by the method. If e04uf has been called previously with the same values of n, nclin and ncnln, istate already contains satisfactory information. (See also the description of the optional parameter Warm Start.) The method also adjusts (if necessary) the values supplied in x to be consistent with istate. Constraint: , for .

On final exit: the status of the constraints in the QP working set at the point returned in x. The significance of each possible value of an element of istate is as follows:

−2 | This constraint violates its lower bound by more than the appropriate feasibility tolerance (see the optional parameters Linear Feasibility Tolerance and Nonlinear Feasibility Tolerance). This value can occur only when no feasible point can be found for a QP subproblem. |
−1 | This constraint violates its upper bound by more than the appropriate feasibility tolerance (see the optional parameters Linear Feasibility Tolerance and Nonlinear Feasibility Tolerance). This value can occur only when no feasible point can be found for a QP subproblem. |
0 | The constraint is satisfied to within the feasibility tolerance, but is not in the QP working set. |
1 | This inequality constraint is included in the QP working set at its lower bound. |
2 | This inequality constraint is included in the QP working set at its upper bound. |
3 | This constraint is included in the QP working set as an equality. This value of istate can occur only when the corresponding elements of bl and bu are equal. |
- c
- Type: array<System..::..Double>[]()[][]An array of size [dim1]Note: the dimension of the array c must be at least .On initial entry: need not be set.On intermediate re-entry: if or and , must contain the value of the th constraint at . The remaining elements of c, corresponding to the non-positive elements of needc, are ignored.On final exit: if , contains the value of the th nonlinear constraint function at the final iterate, for .If , the array c is not referenced.
- cjac
- Type: array<System..::..Double,2>[,](,)[,][,]An array of size [dim1, dim2]Note: dim1 must satisfy the constraint:Note: the second dimension of the array cjac must be at least if , and at least otherwise.On initial entry: in general, cjac need not be initialized before the call to e04uf. However, if the optional parameter or , you may optionally set the constant elements of cjac. Such constant elements need not be re-assigned on subsequent intermediate exits.If all elements of the constraint Jacobian are known (i.e., or ), any constant elements may be assigned to cjac one time only at the start of the optimization. An element of cjac that is not subsequently assigned during an intermediate exit will retain its initial value throughout. Constant elements may be loaded into cjac either before the call to e04uf or during the first intermediate exit. The ability to preload constants is useful when many Jacobian elements are identically zero, in which case cjac may be initialized to zero and nonzero elements may be reset during intermediate exits.On intermediate re-entry: if or and , the th row of cjac must contain the available elements of the vector given bywhere is the partial derivative of the th constraint with respect to the th variable, evaluated at the point . The remaining rows of cjac, corresponding to non-positive elements of needc, are ignored. The th row of the Jacobian should be stored in elements , for and .Note that constant nonzero elements do affect the values of the constraints. Thus, if is set to a constant value, it need not be reset during subsequent intermediate exits, but the value must nonetheless be added to . For example, if and , then the term must be included in the definition of .It must be emphasized that, if or , unassigned elements of cjac are not treated as constant; they are estimated by finite differences, at nontrivial expense. If you do not supply a value for the optional parameter Difference Interval, an interval for each element of is computed automatically at the start of the optimization. The automatic procedure can usually identify constant elements of cjac, which are then computed once only by finite differences.See also the description of the optional parameter Verify.
- clamda
- Type: array<System..::..Double>[]()[][]An array of size []On initial entry: need not be set if the (default) optional parameter Cold Start is used.If the optional parameter Warm Start has been chosen, must contain a multiplier estimate for each nonlinear constraint with a sign that matches the status of the constraint specified by the istate array, for . The remaining elements need not be set. Note that if the th constraint is defined as ‘inactive’ by the initial value of the istate array (i.e. ), should be zero; if the th constraint is an inequality active at its lower bound (i.e. ), should be non-negative; if the th constraint is an inequality active at its upper bound (i.e. ), should be non-positive. If necessary, the method will modify clamda to match these rules.On final exit: the values of the QP multipliers from the last QP subproblem. should be non-negative if and non-positive if .
- objf
- Type: System..::..Double%On initial entry: need not be set.On intermediate re-entry: if or , objf must be set to the value of the objective function at .On final exit: the value of the objective function at the final iterate.
- objgrd
- Type: array<System..::..Double>[]()[][]An array of size [n]On initial entry: need not be set.On intermediate re-entry: if or , objgrd must contain the available elements of the gradient evaluated at .See also the description of the optional parameter Verify.On final exit: the gradient of the objective function at the final iterate (or its finite difference approximation).
- r
- Type: array<System..::..Double,2>[,](,)[,][,]An array of size [dim1, n]Note: dim1 must satisfy the constraint:On initial entry: need not be initialized if the (default) optional parameter Cold Start is used.If the optional parameter Warm Start has been chosen, r must contain the upper triangular Cholesky factor of the initial approximation of the Hessian of the Lagrangian function, with the variables in the natural order. Elements not in the upper triangular part of r are assumed to be zero and need not be assigned.On final exit: if , r contains the upper triangular Cholesky factor of , an estimate of the transformed and reordered Hessian of the Lagrangian at (see (6) in [Overview]).If , r contains the upper triangular Cholesky factor of , the approximate (untransformed) Hessian of the Lagrangian, with the variables in the natural order.
- x
- Type: array<System..::..Double>[]()[][]An array of size [n]On initial entry: an initial estimate of the solution.On intermediate exit: the point at which the objective function, constraint functions or their derivatives are to be evaluated.On final exit: the final estimate of the solution.
- needc
- Type: array<System..::..Int32>[]()[][]An array of size []
- iwork
- Type: array<System..::..Int32>[]()[][]An array of size [liwork]the dimension of the array iwork.On initial entry: the dimension of the array iwork as declared in the (sub)program from which e04uf is called.Constraint: .
- work
- Type: array<System..::..Double>[]()[][]An array of size [lwork]the dimension of the array work.On initial entry: the dimension of the array work as declared in the (sub)program from which e04uf is called.Constraints:
- if and , ;
- if and , ;
- if and , .
The amounts of workspace provided and required may be (by default for e04uf) output on the current advisory message unit (as defined by (X04ABF not in this release)). As an alternative to computing liwork and lwork from the formulae given above, you may prefer to obtain appropriate values from the output of a preliminary run with liwork and lwork set to . (e04uf will then terminate with .)
- options
- Type: NagLibrary..::..E04..::..e04ufOptionsAn Object of type E04.e04ufOptions. Used to configure optional parameters to this method.
- ifail
- Type: System..::..Int32%On exit: unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).
Description
e04uf is designed to solve the nonlinear programming problem – the minimization of a smooth nonlinear function subject to a set of constraints on the variables. The problem is assumed to be stated in the following form:

$$\underset{x \in R^n}{\mathrm{minimize}}\; F(x) \quad \text{subject to} \quad l \le \left\{\begin{matrix} x \\ A_L x \\ c(x) \end{matrix}\right\} \le u, \qquad (1)$$

where $F(x)$ (the objective function) is a nonlinear function, $A_L$ is an $n_L$ by $n$ constant matrix, and $c(x)$ is an $n_N$-element vector of nonlinear constraint functions. (The matrix $A_L$ and the vector $c(x)$ may be empty.) The objective function and the constraint functions are assumed to be smooth, i.e., at least twice-continuously differentiable. (The method of e04uf will usually solve (1) if there are only isolated discontinuities away from the solution.)
Note that although the bounds on the variables could be included in the definition of the linear constraints, we prefer to distinguish between them for reasons of computational efficiency. For the same reason, the linear constraints should not be included in the definition of the nonlinear constraints. Upper and lower bounds are specified for all the variables and for all the constraints. An equality constraint can be specified by setting $l_j = u_j$. If certain bounds are not present, the associated elements of $l$ or $u$ can be set to special values that will be treated as $-\infty$ or $+\infty$. (See the description of the optional parameter Infinite Bound Size.)
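For illustration, the following C# fragment shows how the bl and bu arrays (which hold $l$ and $u$ in the order: variables, then general linear constraints, then nonlinear constraints, as described under [Parameters]) might be set up. It is a sketch only: the problem dimensions and bound values are arbitrary, and the value 1.0e20 used for an 'infinite' bound assumes the default setting of the optional parameter Infinite Bound Size.

```csharp
// Hedged sketch: fill bl/bu in the order variables, linear constraints, nonlinear
// constraints. The value 1.0e20 assumes the default Infinite Bound Size.
int n = 4, nclin = 1, ncnln = 2;
double bigbnd = 1.0e20;
double[] bl = new double[n + nclin + ncnln];
double[] bu = new double[n + nclin + ncnln];

for (int j = 0; j < n; j++) { bl[j] = 1.0; bu[j] = 5.0; }  // simple bounds on the variables
bl[n] = -bigbnd;          bu[n] = 20.0;                    // linear constraint with no lower bound
bl[n + nclin] = -bigbnd;  bu[n + nclin] = 40.0;            // first nonlinear constraint
bl[n + nclin + 1] = 25.0; bu[n + nclin + 1] = bigbnd;      // second nonlinear constraint
// An equality constraint j would be specified by setting bl[j] equal to bu[j].
```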
If there are no nonlinear constraints in (1) and is linear or quadratic then it will generally be more efficient to use one of e04mf, e04nc or e04nf, or e04nq if the problem is large and sparse. If the problem is large and sparse and does have nonlinear constraints, e04ug should be used, since e04uf treats all matrices as dense.
e04uf uses reverse communication for evaluating , and as many of their first partial derivatives as possible; any remaining derivatives are approximated by finite differences. See the description of the optional parameter Derivative Level.
On initial entry, you must supply an initial estimate of the solution to (1).
On intermediate exits, the calling program must compute appropriate values for the objective function, the nonlinear constraints or their derivatives, as specified by the parameter irevcm, and then re-enter the method.
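The overall shape of such a calling program is sketched below in C#. This is not the shipped example program (see e04ufe.cs for the authoritative usage): it assumes `using NagLibrary;`, a parameterless constructor for the options object, and placeholder workspace lengths, all of which should be checked against the parameter descriptions above.

```csharp
// Hedged sketch of a reverse-communication loop around e04uf.
int n = 4, nclin = 1, ncnln = 2;
int irevcm = 0, iter = 0, ifail = 0;
double objf = 0.0;
double[,] a = new double[nclin, n];            // general linear constraint matrix
double[] bl = new double[n + nclin + ncnln];   // lower bounds: variables, linear, nonlinear
double[] bu = new double[n + nclin + ncnln];   // upper bounds, in the same order
int[] istate = new int[n + nclin + ncnln];
double[] c = new double[ncnln];
double[,] cjac = new double[ncnln, n];
double[] clamda = new double[n + nclin + ncnln];
double[] objgrd = new double[n];
double[,] r = new double[n, n];
double[] x = { 1.0, 5.0, 5.0, 1.0 };           // initial estimate of the solution
int[] needc = new int[ncnln];
int[] iwork = new int[100];                    // liwork: placeholder size (assumption)
double[] work = new double[1000];              // lwork: placeholder size (assumption)
E04.e04ufOptions options = new E04.e04ufOptions();  // assumed parameterless constructor

// ... fill a, bl and bu for the problem here ...

do
{
    E04.e04uf(ref irevcm, n, nclin, ncnln, a, bl, bu, ref iter, istate, c, cjac,
              clamda, ref objf, objgrd, r, x, needc, iwork, work, options, out ifail);

    if (irevcm != 0)
    {
        // Intermediate exit: inspect irevcm (and, for the constraints, needc) to see
        // which of objf, objgrd, c and cjac must be evaluated at the current x.
        // Assign the requested quantities, then re-enter with irevcm unchanged
        // (or set irevcm negative to abandon the solution).
    }
} while (irevcm != 0);

// Final exit: x, objf, istate, clamda and ifail now hold the results.
```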
For maximum reliability, it is preferable to provide all partial derivatives (see Chapter 8 of Gill et al. (1981), for a detailed discussion). If they cannot all be provided, it is advisable to provide as many as possible. While developing code to evaluate the objective function and the constraints, the optional parameter Verify should be used to check the calculation of any known derivatives.
The method used by e04uf is described in detail in [Algorithmic Details].
e04wd is an alternative routine which uses a similar algorithm, but with forward communication: that is, the objective and constraint functions are evaluated by methods supplied as parameters to e04wd.
References
Dennis J E Jr and Moré J J (1977) Quasi-Newton methods, motivation and theory SIAM Rev. 19 46–89
Dennis J E Jr and Schnabel R B (1981) A new derivation of symmetric positive-definite secant updates nonlinear programming (eds O L Mangasarian, R R Meyer and S M Robinson) 4 167–199 Academic Press
Dennis J E Jr and Schnabel R B (1983) Numerical Methods for Unconstrained Optimization and Nonlinear Equations Prentice–Hall
Fletcher R (1987) Practical Methods of Optimization (2nd Edition) Wiley
Gill P E, Hammarling S, Murray W, Saunders M A and Wright M H (1986) Users' guide for LSSOL (Version 1.0) Report SOL 86-1 Department of Operations Research, Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1984a) Procedures for optimization problems with a mixture of bounds and general linear constraints ACM Trans. Math. Software 10 282–298
Gill P E, Murray W, Saunders M A and Wright M H (1984b) Users' guide for SOL/QPSOL version 3.2 Report SOL 84–5 Department of Operations Research, Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1986a) Some theoretical properties of an augmented Lagrangian merit function Report SOL 86–6R Department of Operations Research, Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1986b) Users' guide for NPSOL (Version 4.0): a Fortran package for nonlinear programming Report SOL 86-2 Department of Operations Research, Stanford University
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press
Hock W and Schittkowski K (1981) Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems 187 Springer–Verlag
Murtagh B A and Saunders M A (1983) MINOS 5.0 user's guide Report SOL 83-20 Department of Operations Research, Stanford University
Powell M J D (1974) Introduction to constrained optimization Numerical Methods for Constrained Optimization (eds P E Gill and W Murray) 1–28 Academic Press
Powell M J D (1983) Variable metric methods in constrained optimization Mathematical Programming: the State of the Art (eds A Bachem, M Grötschel and B Korte) 288–311 Springer–Verlag
Error Indicators and Warnings
Note: e04uf may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the method:
Some error messages may refer to parameters that are dropped from this interface (LDA, LDCJ, LDR). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.
- The final iterate satisfies the first-order Kuhn–Tucker conditions (see [Overview]) to the accuracy requested, but the sequence of iterates has not yet converged. e04uf was terminated because no further improvement could be made in the merit function (see [Description of the Printed Output]).This value of ifail may occur in several circumstances. The most common situation is that you ask for a solution with accuracy that is not attainable with the given precision of the problem (as specified by the optional parameter Function Precision). This condition will also occur if, by chance, an iterate is an ‘exact’ Kuhn–Tucker point, but the change in the variables was significant at the previous iteration. (This situation often happens when minimizing very simple functions, such as quadratics.)If the four conditions listed in [Parameters] for are satisfied, is likely to be a solution of (1) even if .
- e04uf has terminated without finding a feasible point for the linear constraints and bounds, which means that either no feasible point exists for the given value of the optional parameter Linear Feasibility Tolerance, or no feasible point could be found in the number of iterations specified by the optional parameter Minor Iteration Limit. You should check that there are no constraint redundancies. If the data for the constraints are accurate only to an absolute precision , you should ensure that the value of the optional parameter Linear Feasibility Tolerance is greater than . For example, if all elements of are of order unity and are accurate to only three decimal places, Linear Feasibility Tolerance should be at least .
- No feasible point could be found for the nonlinear constraints. The problem may have no feasible solution. This means that there has been a sequence of QP subproblems for which no feasible point could be found (indicated by I at the end of each line of intermediate printout produced by the major iterations; see [Description of the Printed Output]). This behaviour will occur if there is no feasible point for the nonlinear constraints. (However, there is no general test that can determine whether a feasible point exists for a set of nonlinear constraints.) If the infeasible subproblems occur from the very first major iteration, it is highly likely that no feasible point exists. If infeasibilities occur when earlier subproblems have been feasible, small constraint inconsistencies may be present. You should check the validity of constraints with negative values of istate. If you are convinced that a feasible point does exist, e04uf should be restarted at a different starting point.
- The limiting number of iterations (as determined by the optional parameter Major Iteration Limit) has been reached.If the algorithm appears to be making satisfactory progress, then optional parameter Major Iteration Limit may be too small. If so, either increase its value and rerun e04uf or, alternatively, rerun e04uf using the optional parameter Warm Start. If the algorithm seems to be making little or no progress however, then you should check for incorrect gradients or ill-conditioning as described under .Note that ill-conditioning in the working set is sometimes resolved automatically by the algorithm, in which case performing additional iterations may be helpful. However, ill-conditioning in the Hessian approximation tends to persist once it has begun, so that allowing additional iterations without altering is usually inadvisable. If the quasi-Newton update of the Hessian approximation was reset during the latter major iterations (i.e., an R occurs at the end of each line of intermediate printout; see [Description of the Printed Output]), it may be worthwhile to try a Warm Start at the final point as suggested above.
- Not used by this method.
- does not satisfy the first-order Kuhn–Tucker conditions (see [Overview]), and no improved point for the merit function (see [Description of the Printed Output]) could be found during the final linesearch.This sometimes occurs because an overly stringent accuracy has been requested, i.e., the value of the optional parameter Optimality Tolerance (, where is the value of the optional parameter Function Precision) is too small. In this case you should apply the four tests described under to determine whether or not the final solution is acceptable (see Gill et al. (1981), for a discussion of the attainable accuracy).If many iterations have occurred in which essentially no progress has been made and e04uf has failed completely to move from the initial point, then values set by the calling program for the objective or constraint functions or their derivatives during intermediate exits may be incorrect. You should refer to comments under and check the gradients using the optional parameter Verify. Unfortunately, there may be small errors in the objective and constraint gradients that cannot be detected by the verification process. Finite difference approximations to first derivatives are catastrophically affected by even small inaccuracies. An indication of this situation is a dramatic alteration in the iterates if the finite difference interval is altered. One might also suspect this type of error if a switch is made to central differences even when Norm Gz and Violtn (see [Description of the Printed Output]) are large.Another possibility is that the search direction has become inaccurate because of ill-conditioning in the Hessian approximation or the matrix of constraints in the working set; either form of ill-conditioning tends to be reflected in large values of Mnr (the number of iterations required to solve each QP subproblem; see [Description of the Printed Output]).If the condition estimate of the projected Hessian (Cond Hz; see [Description of the Printed Output]) is extremely large, it may be worthwhile rerunning e04uf from the final point with the optional parameter Warm Start. In this situation, istate and clamda should be left unaltered and should be reset to the identity matrix.If the matrix of constraints in the working set is ill-conditioned (i.e., Cond T is extremely large; see [Description of Monitoring Information]), it may be helpful to run e04uf with a relaxed value of the optional parameter Feasibility Tolerance. (Constraint dependencies are often indicated by wide variations in size in the diagonal elements of the matrix , whose diagonals will be printed if .)
- The user-supplied derivatives of the objective function and/or nonlinear constraints appear to be incorrect. Large errors were found in the derivatives of the objective function and/or nonlinear constraints. This value of ifail will occur if the verification process indicated that at least one gradient or Jacobian element had no correct figures. You should refer to the printed output to determine which elements are suspected to be in error. As a first step, you should check that the code for the objective and constraint values is correct – for example, by computing the function at a point where the correct value is known. However, care should be taken that the chosen point fully tests the evaluation of the function. It is remarkable how often the values or are used in such a test, and how often the special properties of these numbers make the test meaningless. Special care should be used in the test if computation of the objective function involves subsidiary data communicated in storage. Although the first evaluation of the function may be correct, subsequent calculations may be in error because some of the subsidiary data has accidentally been overwritten. Gradient checking will be ineffective if the objective function uses information computed by the constraints, since they are not necessarily computed before each function evaluation. Errors in programming the function may be quite subtle in that the function value is ‘almost’ correct. For example, the function may not be accurate to full precision because of the inaccurate calculation of a subsidiary quantity, or the limited accuracy of data upon which the function depends. A common error on machines where numerical calculations are usually performed in double precision is to include even one single precision constant in the calculation of the function; since some compilers do not convert such constants to double precision, half the correct figures may be lost by such a seemingly trivial error.
- Not used by this method.
- An input parameter is invalid.
- If overflow occurs then either an element of is very large, or the singular values or singular vectors have been incorrectly supplied.
Accuracy
If ifail = 0 on final exit then the vector returned in the array x is an estimate of the solution to an accuracy of approximately Optimality Tolerance (, where is the machine precision).
Parallelism and Performance
None.
Further Comments
Description of the Printed Output
This section describes the intermediate printout and final printout produced by e04uf. The intermediate printout is a subset of the monitoring information produced by e04uf at every iteration (see [Description of Monitoring Information]). You can control the level of printed output (see the description of the optional parameter Major Print Level). Note that the intermediate printout and final printout are produced only if (the default for e04uf, by default no output is produced by ).
The following line of summary output ( characters) is produced at every major iteration. In all cases, the values of the quantities printed are those in effect on completion of the given iteration.
Maj | is the major iteration count. |
Mnr | is the number of minor iterations required by the feasibility and optimality phases of the QP subproblem. Generally, Mnr will be in the later iterations, since theoretical analysis predicts that the correct active set will be identified near the solution (see [Algorithmic Details]). Note that Mnr may be greater than the optional parameter Minor Iteration Limit if some iterations are required for the feasibility phase. |
Step | is the step taken along the computed search direction. On reasonably well-behaved problems, the unit step (i.e., ) will be taken as the solution is approached. |
Merit Function | is the value of the augmented Lagrangian merit function (12) at the current iterate. This function will decrease at each iteration unless it was necessary to increase the penalty parameters (see [The Merit Function]). As the solution is approached, Merit Function will converge to the value of the objective function at the solution. If the QP subproblem does not have a feasible point (signified by I at the end of the current output line) then the merit function is a large multiple of the constraint violations, weighted by the penalty parameters. During a sequence of major iterations with infeasible subproblems, the sequence of Merit Function values will decrease monotonically until either a feasible subproblem is obtained or e04uf terminates with (no feasible point could be found for the nonlinear constraints). If there are no nonlinear constraints present (i.e., ) then this entry contains Objective, the value of the objective function . The objective function will decrease monotonically to its optimal value when there are no nonlinear constraints. |
Norm Gz | is , the Euclidean norm of the projected gradient (see [Solution of the Quadratic Programming Subproblem]). Norm Gz will be approximately zero in the neighbourhood of a solution. |
Violtn | is the Euclidean norm of the residuals of constraints that are violated or in the predicted active set (not printed if ncnln is zero). Violtn will be approximately zero in the neighbourhood of a solution. |
Cond Hz | is a lower bound on the condition number of the projected Hessian approximation (; see (6)). The larger this number, the more difficult the problem. |
M | is printed if the quasi-Newton update has been modified to ensure that the Hessian approximation is positive definite (see [The Quasi-Newton Update]). |
I | is printed if the QP subproblem has no feasible point. |
C | is printed if central differences have been used to compute the unspecified objective and constraint gradients. If the value of Step is zero then the switch to central differences was made because no lower point could be found in the linesearch. (In this case, the QP subproblem is resolved with the central difference gradient and Jacobian.) If the value of Step is nonzero then central differences were computed because Norm Gz and Violtn imply that is close to a Kuhn–Tucker point (see [Overview]). |
L | is printed if the linesearch has produced a relative change in greater than the value defined by the optional parameter Step Limit. If this output occurs frequently during later iterations of the run, optional parameter Step Limit should be set to a larger value. |
R | is printed if the approximate Hessian has been refactorized. If the diagonal condition estimator of indicates that the approximate Hessian is badly conditioned then the approximate Hessian is refactorized using column interchanges. If necessary, is modified so that its diagonal condition estimator is bounded. |
The final printout includes a listing of the status of every variable and constraint.
The following describes the printout for each variable. A full stop (.) is printed for any numerical value that is zero.
Varbl | gives the name (V) and index , for , of the variable. |
State | gives the state of the variable (FR if neither bound is in the working set, EQ if a fixed variable, LL if on its lower bound, UL if on its upper bound, TF if temporarily fixed at its current value). If Value lies outside the upper or lower bounds by more than the Feasibility Tolerance, State will be ++ or -- respectively. A key is sometimes printed before State. |
Value | is the value of the variable at the final iteration. |
Lower Bound | is the lower bound specified for the variable. None indicates that . |
Upper Bound | is the upper bound specified for the variable. None indicates that . |
Lagr Mult | is the Lagrange multiplier for the associated bound. This will be zero if State is FR unless and , in which case the entry will be blank. If is optimal, the multiplier should be non-negative if State is LL and non-positive if State is UL. |
Slack | is the difference between the variable Value and the nearer of its (finite) bounds and . A blank entry indicates that the associated variable is not bounded (i.e., and ). |
The meaning of the printout for linear and nonlinear constraints is the same as that given above for variables, with ‘variable’ replaced by ‘constraint’, and replaced by and respectively and with the following changes in the heading:
L Con | gives the name (L) and index , for , of the linear constraint. |
N Con | gives the name (N) and index (), for , of the nonlinear constraint. |
Note that movement off a constraint (as opposed to a variable moving away from its bound) can be interpreted as allowing the entry in the Slack column to become positive.
Numerical values are output with a fixed number of digits; they are not guaranteed to be accurate to this precision.
Example
This example is based on Problem 71 in Hock and Schittkowski (1981) and involves the minimization of the nonlinear function

$$F(x) = x_1 x_4 (x_1 + x_2 + x_3) + x_3$$

subject to the bounds

$$1 \le x_i \le 5, \quad i = 1, 2, 3, 4,$$

to the general linear constraint

$$x_1 + x_2 + x_3 + x_4 \le 20,$$

and to the nonlinear constraints

$$x_1^2 + x_2^2 + x_3^2 + x_4^2 \le 40, \qquad x_1 x_2 x_3 x_4 \ge 25.$$

The initial point, which is infeasible, is

$$x_0 = (1, 5, 5, 1)^T,$$

and $F(x_0) = 16$.

The optimal solution (to five figures) is

$$x^* = (1.0, 4.7430, 3.8211, 1.3794)^T,$$

and $F(x^*) = 17.014$. One bound constraint and both nonlinear constraints are active at the solution.
Example program (C#): e04ufe.cs
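The following C# fragment is a hedged sketch of the problem-specific evaluations the calling program would perform on intermediate exits for this example; the helper method names are hypothetical, and the shipped example program e04ufe.cs remains the authoritative version.

```csharp
// Hedged sketch of the evaluations for the example problem above.
// Helper names are hypothetical; consult e04ufe.cs for the real example code.
static double Objective(double[] x)
{
    // F(x) = x1*x4*(x1 + x2 + x3) + x3
    return x[0] * x[3] * (x[0] + x[1] + x[2]) + x[2];
}

static void ObjectiveGradient(double[] x, double[] objgrd)
{
    objgrd[0] = x[3] * (2.0 * x[0] + x[1] + x[2]);
    objgrd[1] = x[0] * x[3];
    objgrd[2] = x[0] * x[3] + 1.0;
    objgrd[3] = x[0] * (x[0] + x[1] + x[2]);
}

static void Constraints(double[] x, int[] needc, double[] c, double[,] cjac)
{
    if (needc[0] > 0)
    {
        // c1(x) = x1^2 + x2^2 + x3^2 + x4^2
        c[0] = x[0] * x[0] + x[1] * x[1] + x[2] * x[2] + x[3] * x[3];
        for (int j = 0; j < 4; j++) cjac[0, j] = 2.0 * x[j];
    }
    if (needc[1] > 0)
    {
        // c2(x) = x1*x2*x3*x4
        c[1] = x[0] * x[1] * x[2] * x[3];
        cjac[1, 0] = x[1] * x[2] * x[3];
        cjac[1, 1] = x[0] * x[2] * x[3];
        cjac[1, 2] = x[0] * x[1] * x[3];
        cjac[1, 3] = x[0] * x[1] * x[2];
    }
}
```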
Algorithmic Details
This section contains a detailed description of the method used by e04uf.
Overview
e04uf is essentially identical to the method NPSOL described in Gill et al. (1986b).
At a solution of (1), some of the constraints will be active, i.e., satisfied exactly. An active simple bound constraint implies that the corresponding variable is fixed at its bound, and hence the variables are partitioned into fixed and free variables. Let denote the by matrix of gradients of the active general linear and nonlinear constraints. The number of fixed variables will be denoted by , with () the number of free variables. The subscripts ‘FX’ and ‘FR’ on a vector or matrix will denote the vector or matrix composed of the elements corresponding to fixed or free variables.
A point is a first-order Kuhn–Tucker point for (1) (see Powell (1974)) if the following conditions hold:
(i) | is feasible; | ||
(ii) | there exist vectors and (the Lagrange multiplier vectors for the bound and general constraints) such that
|
||
(iii) | the Lagrange multiplier corresponding to an inequality constraint active at its lower bound must be non-negative. It is non-positive for an inequality constraint active at its upper bound. |
Let denote a matrix whose columns form a basis for the set of vectors orthogonal to the rows of ; i.e., . An equivalent statement of the condition (2) in terms of is
The vector is termed the projected gradient of at . Certain additional conditions must be satisfied in order for a first-order Kuhn–Tucker point to be a solution of (1) (see Powell (1974)).
e04uf implements a sequential quadratic programming (SQP) method. For an overview of SQP methods, see Fletcher (1987), Gill et al. (1981) and Powell (1983).
The basic structure of e04uf involves major and minor iterations. The major iterations generate a sequence of iterates that converge to $x^*$, a first-order Kuhn–Tucker point of (1). At a typical major iteration, the new iterate $\bar{x}$ is defined by

$$\bar{x} = x + \alpha p \qquad (3)$$

where $x$ is the current iterate, the non-negative scalar $\alpha$ is the step length, and $p$ is the search direction. (For simplicity, we shall always consider a typical iteration and avoid reference to the index of the iteration.) Also associated with each major iteration are estimates of the Lagrange multipliers and a prediction of the active set.
The search direction $p$ in (3) is the solution of a quadratic programming subproblem of the form

$$\underset{p}{\mathrm{minimize}}\; g^T p + \tfrac{1}{2} p^T H p \quad \text{subject to} \quad \bar{l} \le \left\{\begin{matrix} p \\ A_L p \\ A_N p \end{matrix}\right\} \le \bar{u}, \qquad (4)$$

where $g$ is the gradient of $F$ at $x$, the matrix $H$ is a positive definite quasi-Newton approximation to the Hessian of the Lagrangian function (see [The Quasi-Newton Update]), and $A_N$ is the Jacobian matrix of $c$ evaluated at $x$. (Finite difference estimates may be used for $g$ and $A_N$; see the optional parameter Derivative Level.) Let $l$ in (1) be partitioned into three sections: $l_B$, $l_L$ and $l_N$, corresponding to the bound, linear and nonlinear constraints. The vector $\bar{l}$ in (4) is similarly partitioned and is defined as

$$\bar{l}_B = l_B - x, \quad \bar{l}_L = l_L - A_L x, \quad \bar{l}_N = l_N - c,$$

where $c$ is the vector of nonlinear constraints evaluated at $x$. The vector $\bar{u}$ is defined in an analogous fashion.
The estimated Lagrange multipliers at each major iteration are the Lagrange multipliers from the subproblem (4) (and similarly for the predicted active set). (The numbers of bounds, general linear and nonlinear constraints in the QP active set are the quantities Bnd, Lin and Nln in the monitoring file output of e04uf; see [Description of Monitoring Information].) In e04uf, (4) is solved using e04nc. Since solving a quadratic program is itself an iterative procedure, the minor iterations of e04uf are the iterations of e04nc. (More details about solving the subproblem are given in [Solution of the Quadratic Programming Subproblem].)
Certain matrices associated with the QP subproblem are relevant in the major iterations. Let the subscripts ‘FX’ and ‘FR’ refer to the predicted fixed and free variables, and let denote the by matrix of gradients of the general linear and nonlinear constraints in the predicted active set. Firstly, we have available the factorization of :
where is a nonsingular by reverse-triangular matrix (i.e., if ), and the nonsingular by matrix is the product of orthogonal transformations (see Gill et al. (1984b)). Secondly, we have the upper triangular Cholesky factor of the transformed and reordered Hessian matrix
where is the Hessian with rows and columns permuted so that the free variables are first and is the by matrix
with the identity matrix of order . If the columns of are partitioned so that
then the () columns of form a basis for the null space of . The matrix is used to compute the projected gradient at the current iterate. (The values Nz and Norm Gz printed by e04uf give and , see [Description of Monitoring Information].)
(5) |
(6) |
(7) |
A theoretical characteristic of SQP methods is that the predicted active set from the QP subproblem (4) is identical to the correct active set in a neighbourhood of . In e04uf, this feature is exploited by using the QP active set from the previous iteration as a prediction of the active set for the next QP subproblem, which leads in practice to optimality of the subproblems in only one iteration as the solution is approached. Separate treatment of bound and linear constraints in e04uf also saves computation in factorizing and .
Once has been computed, the major iteration proceeds by determining a step length that produces a ‘sufficient decrease’ in an augmented Lagrangian merit function (see [The Merit Function]). Finally, the approximation to the transformed Hessian matrix is updated using a modified BFGS quasi-Newton update (see [The Quasi-Newton Update]) to incorporate new curvature information obtained in the move from to .
On entry to e04uf, an iterative procedure from e04nc is executed, starting with the user-supplied initial point, to find a point that is feasible with respect to the bounds and linear constraints (using the tolerance specified by the optional parameter Linear Feasibility Tolerance). If no feasible point exists for the bound and linear constraints, (1) has no solution and e04uf terminates. Otherwise, the problem functions will thereafter be evaluated only at points that are feasible with respect to the bounds and linear constraints. The only exception involves variables whose bounds differ by an amount comparable to the finite difference interval (see the discussion of the optional parameter Difference Interval). In contrast to the bounds and linear constraints, it must be emphasized that the nonlinear constraints will not generally be satisfied until an optimal point is reached.
Facilities are provided to check whether the user-supplied gradients appear to be correct (see the description of the optional parameter Verify). In general, the check is provided at the first point that is feasible with respect to the linear constraints and bounds. However, you may request that the check be performed at the initial point.
In summary, the method of e04uf first determines a point that satisfies the bound and linear constraints. Thereafter, each iteration includes:
(a) | the solution of a quadratic programming subproblem; |
(b) | a linesearch with an augmented Lagrangian merit function; and |
(c) | a quasi-Newton update of the approximate Hessian of the Lagrangian function. |
These three procedures are described in more detail in [Solution of the Quadratic Programming Subproblem] to [The Quasi-Newton Update].
Solution of the Quadratic Programming Subproblem
The search direction is obtained by solving (4) using e04nc (see Gill et al. (1986)), which was specifically designed to be used within an SQP algorithm for nonlinear programming.
e04nc is based on a two-phase (primal) quadratic programming method. The two phases of the method are: finding an initial feasible point by minimizing the sum of infeasibilities (the feasibility phase) and minimizing the quadratic objective function within the feasible region (the optimality phase). The computations in both phases are performed by the same methods. The two-phase nature of the algorithm is reflected by changing the function being minimized from the sum of infeasibilities to the quadratic objective function.
In general, a quadratic program must be solved by iteration. Let denote the current estimate of the solution of (4); the new iterate is defined by
where, as in (3), is a non-negative step length and is a search direction.
(8) |
At the beginning of each iteration of e04nc, a working set is defined of constraints (general and bound) that are satisfied exactly. The vector is then constructed so that the values of constraints in the working set remain unaltered for any move along . For a bound constraint in the working set, this property is achieved by setting the corresponding element of to zero, i.e., by fixing the variable at its bound. As before, the subscripts ‘FX’ and ‘FR’ denote selection of the elements associated with the fixed and free variables.
Let denote the sub-matrix of rows of
corresponding to general constraints in the working set. The general constraints in the working set will remain unaltered if
which is equivalent to defining as
for some vector , where is the matrix associated with the factorization (5) of .
(9) |
(10) |
The definition of in (10) depends on whether the current is feasible. If not, is zero except for an element in the th position, where and are chosen so that the sum of infeasibilities is decreasing along . (For further details, see Gill et al. (1986).) In the feasible case, satisfies the equations
where is the Cholesky factor of and is the gradient of the quadratic objective function . (The vector is the projected gradient of the QP.) With (11), is the minimizer of the quadratic objective function subject to treating the constraints in the working set as equalities.
(11) |
If the QP projected gradient is zero, the current point is a constrained stationary point in the subspace defined by the working set. During the feasibility phase, the projected gradient will usually be zero only at a vertex (although it may vanish at non-vertices in the presence of constraint dependencies). During the optimality phase, a zero projected gradient implies that minimizes the quadratic objective function when the constraints in the working set are treated as equalities. In either case, Lagrange multipliers are computed. Given a positive constant of the order of the machine precision, the Lagrange multiplier corresponding to an inequality constraint in the working set is said to be optimal if when the th constraint is at its upper bound, or if when the associated constraint is at its lower bound. If any multiplier is nonoptimal, the current objective function (either the true objective or the sum of infeasibilities) can be reduced by deleting the corresponding constraint from the working set.
If optimal multipliers occur during the feasibility phase and the sum of infeasibilities is nonzero, no feasible point exists. The QP algorithm will then continue iterating to determine the minimum sum of infeasibilities. At this point, the Lagrange multiplier will satisfy for an inequality constraint at its upper bound, and for an inequality at its lower bound. The Lagrange multiplier for an equality constraint will satisfy .
The choice of step length in the QP iteration (8) is based on remaining feasible with respect to the satisfied constraints. During the optimality phase, if is feasible, will be taken as unity. (In this case, the projected gradient at will be zero.) Otherwise, is set to , the step to the ‘nearest’ constraint, which is added to the working set at the next iteration.
Each change in the working set leads to a simple change to : if the status of a general constraint changes, a row of is altered; if a bound constraint enters or leaves the working set, a column of changes. Explicit representations are recurred of the matrices , and , and of the vectors and .
The Merit Function
After computing the search direction as described in [Solution of the Quadratic Programming Subproblem], each major iteration proceeds by determining a step length $\alpha$ in (3) that produces a ‘sufficient decrease’ in the augmented Lagrangian merit function

$$\mathcal{L}(x, \lambda, s) = F(x) - \sum_i \lambda_i \bigl(c_i(x) - s_i\bigr) + \tfrac{1}{2} \sum_i \rho_i \bigl(c_i(x) - s_i\bigr)^2, \qquad (12)$$

where $x$, $\lambda$ and $s$ vary during the linesearch. The summation terms in (12) involve only the nonlinear constraints. The vector $\lambda$ is an estimate of the Lagrange multipliers for the nonlinear constraints of (1). The non-negative slack variables $\{s_i\}$ allow nonlinear inequality constraints to be treated without introducing discontinuities. The solution of the QP subproblem (4) provides a vector triple that serves as a direction of search for the three sets of variables. The non-negative vector $\rho$ of penalty parameters is initialized to zero at the beginning of the first major iteration. Thereafter, selected elements are increased whenever necessary to ensure descent for the merit function. Thus, the sequence of norms of $\rho$ (the printed quantity Penalty; see [Description of Monitoring Information]) is generally nondecreasing, although each $\rho_i$ may be reduced a limited number of times.
The merit function (12) and its global convergence properties are described in Gill et al. (1986a).
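For illustration, (12) can be transcribed directly into code. The following C# fragment is a sketch only, assuming the form of (12) given above; the argument names are hypothetical and the library's internal evaluation is not exposed.

```csharp
// Hedged sketch: evaluate the augmented Lagrangian merit function (12) for a
// given objective value fx, nonlinear constraint values cx, multiplier estimates
// lambda, slacks s and penalty parameters rho. Purely illustrative.
static double MeritFunction(double fx, double[] cx, double[] lambda, double[] s, double[] rho)
{
    double merit = fx;
    for (int i = 0; i < cx.Length; i++)
    {
        double d = cx[i] - s[i];                       // c_i(x) - s_i
        merit += -lambda[i] * d + 0.5 * rho[i] * d * d;
    }
    return merit;
}
```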
The Quasi-Newton Update
The matrix $H$ in (4) is a positive definite quasi-Newton approximation to the Hessian of the Lagrangian function. (For a review of quasi-Newton methods, see Dennis and Schnabel (1983).) At the end of each major iteration, a new Hessian approximation $\bar{H}$ is defined as a rank-two modification of $H$. In e04uf, the BFGS (Broyden–Fletcher–Goldfarb–Shanno) quasi-Newton update is used:

$$\bar{H} = H - \frac{H s s^T H}{s^T H s} + \frac{y y^T}{y^T s}, \qquad (13)$$

where $s = \bar{x} - x$ (the change in $x$).
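To make (13) concrete, the following C# fragment applies the standard BFGS rank-two update to a dense symmetric approximation. It is purely illustrative: e04uf itself updates the Cholesky factor of the transformed Hessian (see (15)) rather than forming $H$ explicitly.

```csharp
// Hedged sketch: BFGS update (13) applied in place to a dense symmetric H,
// with s = xbar - x and y the change in the Lagrangian gradient.
// Requires y^T s > 0 (and s^T H s > 0) for the result to stay positive definite.
static void BfgsUpdate(double[,] H, double[] s, double[] y)
{
    int n = s.Length;
    double[] Hs = new double[n];
    double sHs = 0.0, ys = 0.0;
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < n; j++) Hs[i] += H[i, j] * s[j];
        ys += y[i] * s[i];
    }
    for (int i = 0; i < n; i++) sHs += s[i] * Hs[i];

    // Hbar = H - (H s s^T H) / (s^T H s) + (y y^T) / (y^T s)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            H[i, j] += -Hs[i] * Hs[j] / sHs + y[i] * y[j] / ys;
}
```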
In e04uf, is required to be positive definite. If is positive definite, defined by (13) will be positive definite if and only if is positive (see Dennis and Moré (1977)). Ideally, in (13) would be taken as , the change in gradient of the Lagrangian function
where denotes the QP multipliers associated with the nonlinear constraints of the original problem. If is not sufficiently positive, an attempt is made to perform the update with a vector of the form
where . If no such vector can be found, the update is performed with a scaled . In this case, M is printed to indicate that the update was modified.
(14) |
Rather than modifying itself, the Cholesky factor of the transformed Hessian
(6) is updated, where is the matrix from (5) associated with the active set of the QP subproblem. The update (13) is equivalent to the following update to :
where , and . This update may be expressed as a rank-one update to (see Dennis and Schnabel (1981)).
(15) |
Description of Monitoring Information
This section describes the long line of output ( characters) which forms part of the monitoring information produced by e04uf. (See also the description of the optional parameters Major Print Level, Minor Print Level and Monitoring File.) You can control the level of printed output (see the description of the optional parameter Major Print Level).
When and , the following line of output is produced at every major iteration of e04uf on the unit number specified by Monitoring File. In all cases, the values of the quantities printed are those in effect on completion of the given iteration.
Maj | is the major iteration count. |
Mnr | is the number of minor iterations required by the feasibility and optimality phases of the QP subproblem. Generally, Mnr will be in the later iterations, since theoretical analysis predicts that the correct active set will be identified near the solution (see [Algorithmic Details]). Note that Mnr may be greater than the optional parameter Minor Iteration Limit if some iterations are required for the feasibility phase. |
Step | is the step taken along the computed search direction. On reasonably well-behaved problems, the unit step (i.e., ) will be taken as the solution is approached. |
Nfun | is the cumulative number of evaluations of the objective function needed for the linesearch. Evaluations needed for the estimation of the gradients by finite differences are not included. Nfun is printed as a guide to the amount of work required for the linesearch. |
Merit Function | is the value of the augmented Lagrangian merit function (12) at the current iterate. This function will decrease at each iteration unless it was necessary to increase the penalty parameters (see [The Merit Function]). As the solution is approached, Merit Function will converge to the value of the objective function at the solution. If the QP subproblem does not have a feasible point (signified by I at the end of the current output line) then the merit function is a large multiple of the constraint violations, weighted by the penalty parameters. During a sequence of major iterations with infeasible subproblems, the sequence of Merit Function values will decrease monotonically until either a feasible subproblem is obtained or e04uf terminates with (no feasible point could be found for the nonlinear constraints). If there are no nonlinear constraints present (i.e., ) then this entry contains Objective, the value of the objective function . The objective function will decrease monotonically to its optimal value when there are no nonlinear constraints. |
Norm Gz | is , the Euclidean norm of the projected gradient (see [Solution of the Quadratic Programming Subproblem]). Norm Gz will be approximately zero in the neighbourhood of a solution. |
Violtn | is the Euclidean norm of the residuals of constraints that are violated or in the predicted active set (not printed if ncnln is zero). Violtn will be approximately zero in the neighbourhood of a solution. |
Nz | is the number of columns of (see [Solution of the Quadratic Programming Subproblem]). The value of Nz is the number of variables minus the number of constraints in the predicted active set; i.e., . |
Bnd | is the number of simple bound constraints in the predicted active set. |
Lin | is the number of general linear constraints in the predicted working set. |
Nln | is the number of nonlinear constraints in the predicted active set (not printed if ncnln is zero). |
Penalty | is the Euclidean norm of the vector of penalty parameters used in the augmented Lagrangian merit function (not printed if ncnln is zero). |
Cond H | is a lower bound on the condition number of the Hessian approximation . |
Cond Hz | is a lower bound on the condition number of the projected Hessian approximation (; see (6)). The larger this number, the more difficult the problem. |
Cond T | is a lower bound on the condition number of the matrix of predicted active constraints. |
Conv | is a three-letter indication of the status of the three convergence tests (16)–(18) defined in the description of the optional parameter Optimality Tolerance. Each letter is T if the test is satisfied and F otherwise. The three tests indicate whether: (i) the sequence of iterates has converged; (ii) the projected gradient (Norm Gz) is sufficiently small; and (iii) the norm of the residuals of constraints in the predicted active set (Violtn) is sufficiently small. If any of these indicators is F when e04uf terminates with , you should check the solution carefully. |
M | is printed if the quasi-Newton update has been modified to ensure that the Hessian approximation is positive definite (see [The Quasi-Newton Update]). |
I | is printed if the QP subproblem has no feasible point. |
C | is printed if central differences have been used to compute the unspecified objective and constraint gradients. If the value of Step is zero then the switch to central differences was made because no lower point could be found in the linesearch. (In this case, the QP subproblem is resolved with the central difference gradient and Jacobian.) If the value of Step is nonzero then central differences were computed because Norm Gz and Violtn imply that is close to a Kuhn–Tucker point (see [Overview]). |
L | is printed if the linesearch has produced a relative change in greater than the value defined by the optional parameter Step Limit. If this output occurs frequently during later iterations of the run, optional parameter Step Limit should be set to a larger value. |