naginterfaces.library.opt.nlp2_solve

naginterfaces.library.opt.nlp2_solve(a, bl, bu, objfun, istate, ccon, cjac, clamda, h, x, comm, confun=None, data=None, io_manager=None, spiked_sorder='C')

nlp2_solve is designed to minimize an arbitrary smooth function subject to constraints (which may include simple bounds on the variables, linear constraints and smooth nonlinear constraints) using a Sequential Quadratic Programming (SQP) method. As many first derivatives as possible should be supplied by you; any unspecified derivatives are approximated by finite differences. It is not intended for large sparse problems.

nlp2_solve may also be used for unconstrained, bound-constrained and linearly constrained optimization.

nlp2_solve uses forward communication for evaluating the objective function, the nonlinear constraint functions, and any of their derivatives.

The initialization function nlp2_init() must have been called before calling nlp2_solve.

Note: this function uses optional algorithmic parameters, see also: nlp2_option_file(), nlp2_option_string(), nlp2_option_integer_set(), nlp2_option_double_set(), nlp2_init().

For full information, please refer to the NAG Library document for e04wd:

https://support.nag.com/numeric/nl/nagdoc_31.1/flhtml/e04/e04wdf.html

Parameters

a : float, array-like, shape $(nclin, :)$

Note: the required extent for this argument in dimension 2 is determined as follows: if $nclin > 0$ : $n$ ; otherwise: $1$ .

The $i$ th row of $a$ contains the $i$ th row of the matrix $A_{L}$ of general linear constraints in (1). That is, the $i$ th row contains the coefficients of the $i$ th general linear constraint, for $i = 1, 2, \dots, nclin$ .

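bl : float, array-like, shape $(n + nclin + ncnln)$

$b l$ must contain the lower bounds for all the constraints, in the following order: the first $n$ elements hold the bounds on the variables, the next $nclin$ elements hold the bounds on the general linear constraints $A_{L} x$ (if any), and the last $ncnln$ elements hold the bounds on the nonlinear constraints $c (x)$ (if any).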

bu : float, array-like, shape $(n + nclin + ncnln)$

$b u$ must contain the upper bounds for all the constraints (variables first, then the general linear constraints, then the nonlinear constraints).

objfun : callable (mode, objf, grad) = objfun(mode, x, grad, nstate, data=None)

$o b j f u n$ must calculate the objective function $F (x)$ and (optionally) its gradient $g (x) = \frac{\partial F}{\partial x}$ for a specified $n$ -vector $x$ .

Parameters

mode : int

Is set by nlp2_solve to indicate which values must be assigned during each call of $o b j f u n$ . Only the following values need be assigned:

$m o d e = 0$

$o b j f$ .

$m o d e = 1$

All available elements of $g r a d$ .

$m o d e = 2$

$o b j f$ and all available elements of $g r a d$ .

x : float, ndarray, shape $(n)$

$x$ , the vector of variables at which the objective function and/or all available elements of its gradient are to be evaluated.

grad : float, ndarray, shape $(n)$

The elements of $g r a d$ are set to special values that enable nlp2_solve to detect whether they are reset by $o b j f u n$ .

nstate : int

If $n s t a t e = 1$ , then nlp2_solve is calling $o b j f u n$ for the first time. This argument setting allows you to save computation time if certain data must be read or calculated only once.

data : arbitrary, optional, modifiable in place

User-communication data for callback functions.

Returns

mode : int

May be used to indicate that you are unable or unwilling to evaluate the objective function at the current $x$ .

During the linesearch, the function is evaluated at points of the form $x = x_{k} + α p_{k}$ after it has already been evaluated satisfactorily at $x_{k}$ .

For any such $x$ , if you set $m o d e = - 1$ , nlp2_solve will reduce $α$ and evaluate the functions again (closer to $x_{k}$ , where they are more likely to be defined).

If for some reason you wish to terminate the current problem, set $m o d e < - 1$ .

objf : float

If $m o d e = 0$ or $2$ , $o b j f$ must be set to the value of the objective function at $x$ .

grad : float, array-like, shape $(n)$

If $m o d e = 1$ or $2$ , $g r a d$ must return the available elements of the gradient evaluated at $x$ , i.e., $g r a d [i - 1]$ contains the partial derivative $\frac{\partial F}{\partial x_{i}}$ .
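As an illustration of the $m o d e$ protocol described above, the following is a minimal sketch of an $o b j f u n$ callback. The objective $F (x) = x_{1}^{2} + 3 x_{2}^{2}$ is a hypothetical example, not taken from the NAG documentation, and plain Python sequences stand in for the ndarray arguments that nlp2_solve would pass.

```python
def objfun(mode, x, grad, nstate, data=None):
    """Sketch of an objfun callback for the hypothetical F(x) = x0**2 + 3*x1**2."""
    objf = 0.0
    if mode in (0, 2):
        # The objective value is only required when mode is 0 or 2.
        objf = x[0] ** 2 + 3.0 * x[1] ** 2
    if mode in (1, 2):
        # The gradient is only required when mode is 1 or 2.
        grad[0] = 2.0 * x[0]
        grad[1] = 6.0 * x[1]
    # Returning mode unchanged signals a successful evaluation.
    return mode, objf, grad
```

Skipping the gradient when $m o d e = 0$ keeps function-only evaluations cheap; elements of $g r a d$ that are left untouched stay at the values passed in.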

istate : int, array-like, shape $(n + nclin + ncnln)$

Is an integer array that need not be initialized if nlp2_solve is called with the ‘Cold Start’ option (the default).

If option ‘Warm Start’ has been chosen, every element of $i s t a t e$ must be set.

If nlp2_solve has just been called on a problem with the same dimensions, $i s t a t e$ already contains valid values.

Otherwise, $i s t a t e [j - 1]$ should indicate whether either of the constraints $r_{j} (x) \geq l_{j}$ or $r_{j} (x) \leq u_{j}$ is expected to be active at a solution of (1).

The ordering of $i s t a t e$ is the same as for $b l$ , $b u$ and $r (x)$ , i.e., the first $n$ components of $i s t a t e$ refer to the upper and lower bounds on the variables, the next $nclin$ refer to the bounds on $A_{L} x$ , and the last $ncnln$ refer to the bounds on $c (x)$ .

Possible values of $i s t a t e [j - 1]$ follow:

$0$	Neither $r_{j} (x) \geq l_{j}$ nor $r_{j} (x) \leq u_{j}$ is expected to be active.
$1$	$r_{j} (x) \geq l_{j}$ is expected to be active.
$2$	$r_{j} (x) \leq u_{j}$ is expected to be active.
$3$	This may be used if $l_{j} = u_{j}$ . Normally an equality constraint $r_{j} (x) = l_{j} = u_{j}$ is active at a solution.

The values $1$ , $2$ or $3$ all have the same effect when $b l [j - 1] = b u [j - 1]$ .

If necessary, nlp2_solve will override your specification of $i s t a t e$ , so that a poor choice will not cause the algorithm to fail.

ccon : float, array-like, shape $(max (1, ncnln))$

$c c o n$ need not be initialized if the (default) option ‘Cold Start’ is used.

For a ‘Warm Start’, and if $ncnln > 0$ , $c c o n$ contains values of the nonlinear constraint functions $c_{i}$ , for $i = 1, 2, \dots, ncnln$ , calculated in a previous call to nlp2_solve.

cjac : float, array-like, shape $(ncnln, :)$

Note: the required extent for this argument in dimension 2 is determined as follows: if $ncnln > 0$ : $n$ ; otherwise: $1$ .

In general, $c j a c$ need not be initialized before the call to nlp2_solve. However, if $‘Derivative Level' = 2$ or $3$ , any constant elements of $c j a c$ may be initialized. Such elements need not be reassigned on subsequent calls to $c o n f u n$ .

clamda : float, array-like, shape $(n + nclin + ncnln)$

Need not be set if the (default) option ‘Cold Start’ is used.

If the option ‘Warm Start’ has been chosen, $c l a m d a [j - 1]$ must contain a multiplier estimate for each nonlinear constraint, with a sign that matches the status of the constraint specified by the $i s t a t e$ array, for $j = n + nclin + 1, \dots, n + nclin + ncnln$ .

The remaining elements need not be set.

If the $j$ th constraint is defined as ‘inactive’ by the initial value of the $i s t a t e$ array (i.e., $i s t a t e [j - 1] = 0$ ), $c l a m d a [j - 1]$ should be zero; if the $j$ th constraint is an inequality active at its lower bound (i.e., $i s t a t e [j - 1] = 1$ ), $c l a m d a [j - 1]$ should be non-negative; if the $j$ th constraint is an inequality active at its upper bound (i.e., $i s t a t e [j - 1] = 2$ ), $c l a m d a [j - 1]$ should be non-positive.

If necessary, the function will modify $c l a m d a$ to match these rules.
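The sign rules above can be sketched as a small helper that prepares warm-start multipliers. The name match_multiplier_signs and the clipping strategy are assumptions made for illustration; the text does not say how nlp2_solve itself adjusts $c l a m d a$ .

```python
def match_multiplier_signs(istate, clamda):
    """Return a copy of clamda adjusted to the sign rules for each istate value."""
    adjusted = []
    for state, lam in zip(istate, clamda):
        if state == 0:
            # Inactive constraint: the multiplier should be zero.
            adjusted.append(0.0)
        elif state == 1:
            # Active at the lower bound: the multiplier should be non-negative.
            adjusted.append(max(lam, 0.0))
        elif state == 2:
            # Active at the upper bound: the multiplier should be non-positive.
            adjusted.append(min(lam, 0.0))
        else:
            # Other states (e.g. 3, an equality): no sign rule is stated above.
            adjusted.append(lam)
    return adjusted
```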

h : float, array-like, shape $(n, n)$

$h$ need not be initialized if the (default) option ‘Cold Start’ is used, and will be set to the identity.

If the option ‘Warm Start’ has been chosen, $h$ provides the initial approximation of the Hessian of the Lagrangian, i.e., $h [i - 1, j - 1] \approx \frac{\partial^{2} L (x, λ)}{\partial x_{i} \partial x_{j}}$ , where $L (x, λ) = F (x) - {c (x)}^{T} λ$ and $λ$ is an estimate of the Lagrange multipliers. $h$ must be a positive definite matrix.

x : float, array-like, shape $(n)$

An initial estimate of the solution.

comm : dict, communication object, modified in place

Communication structure.

This argument must have been initialized by a prior call to nlp2_init().
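With the default ‘Cold Start’, the arrays $i s t a t e$ , $c c o n$ , $c j a c$ , $c l a m d a$ and $h$ need not hold meaningful values, but they must still have the shapes listed above. The sketch below allocates zero-filled placeholders of those shapes. The helper name cold_start_arrays is hypothetical, and nested lists are used only to keep the sketch dependency-free; in practice NumPy arrays would be the natural choice.

```python
def cold_start_arrays(n, nclin, ncnln):
    """Allocate placeholder inputs with the documented shapes (values are arbitrary)."""
    total = n + nclin + ncnln
    istate = [0] * total                       # shape (n + nclin + ncnln)
    ccon = [0.0] * max(1, ncnln)               # shape (max(1, ncnln))
    jac_cols = n if ncnln > 0 else 1           # extent of dimension 2 of cjac
    cjac = [[0.0] * jac_cols for _ in range(ncnln)]  # shape (ncnln, :)
    clamda = [0.0] * total                     # shape (n + nclin + ncnln)
    h = [[0.0] * n for _ in range(n)]          # shape (n, n)
    return istate, ccon, cjac, clamda, h
```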

confun : None or callable (mode, ccon, cjac) = confun(mode, needc, x, cjac, nstate, data=None), optional

Note: if this argument is None then a NAG-supplied facility will be used.

$c o n f u n$ must calculate the vector $c (x)$ of nonlinear constraint functions and (optionally) its Jacobian, $\frac{\partial c}{\partial x}$ , for a specified $n$ -vector $x$ .

If there are no nonlinear constraints (i.e., $ncnln = 0$ ), nlp2_solve will never call $c o n f u n$ , so it may be None. If there are nonlinear constraints, the first call to $c o n f u n$ will occur before the first call to $o b j f u n$ .

If all constraint gradients (Jacobian elements) are known (i.e., $‘Derivative Level' = 2$ or $3$ ), any constant elements may be assigned to $c j a c$ once only at the start of the optimization.

An element of $c j a c$ that is not subsequently assigned in $c o n f u n$ will retain its initial value throughout.

Constant elements may be loaded in $c j a c$ during the first call to $c o n f u n$ (signalled by the value of $n s t a t e = 1$ ).

The ability to preload constants is useful when many Jacobian elements are identically zero, in which case $c j a c$ may be initialized to zero and nonzero elements may be reset by $c o n f u n$ .

It must be emphasized that, if $‘Derivative Level' < 2$ , unassigned elements of $c j a c$ are not treated as constant; they are estimated by finite differences, at nontrivial expense.

Parameters

mode : int

Is set by nlp2_solve to indicate which values must be assigned during each call of $c o n f u n$ . Only the following values need be assigned, for each value of $i$ such that $n e e d c [i - 1] > 0$ :

$m o d e = 0$

The components of $c c o n$ corresponding to positive values in $n e e d c$ must be set. Other components and the array $c j a c$ are ignored.

$m o d e = 1$

The known components of the rows of $c j a c$ corresponding to positive values in $n e e d c$ must be set. Other rows of $c j a c$ and the array $c c o n$ will be ignored.

$m o d e = 2$

Only the elements of $c c o n$ corresponding to positive values of $n e e d c$ need to be set (and similarly for the known components of the rows of $c j a c$ ).

needc : int, ndarray, shape $(ncnln)$

The indices of the elements of $c c o n$ and/or $c j a c$ that must be evaluated by $c o n f u n$ . If $n e e d c [i - 1] > 0$ , the $i$ th element of $c c o n$ and/or the available elements of the $i$ th row of $c j a c$ (see argument $m o d e$ ) must be evaluated at $x$ .

x : float, ndarray, shape $(n)$

$x$ , the vector of variables at which the constraint functions and/or the available elements of the constraint Jacobian are to be evaluated.

cjac : float, ndarray, shape $(max (1, ncnln), n)$

The elements of $c j a c$ are set to special values that enable nlp2_solve to detect whether they are reset by $c o n f u n$ .

nstate : int

If $n s t a t e = 1$ , then nlp2_solve is calling $c o n f u n$ for the first time. This argument setting allows you to save computation time if certain data must be read or calculated only once.

data : arbitrary, optional, modifiable in place

User-communication data for callback functions.

Returns

mode : int

May be used to indicate that you are unable or unwilling to evaluate the constraint functions at the current $x$ .

During the linesearch, the constraint functions are evaluated at points of the form $x = x_{k} + α p_{k}$ after they have already been evaluated satisfactorily at $x_{k}$ .

For any such $x$ , if you set $m o d e = - 1$ , nlp2_solve will evaluate the functions at some point closer to $x_{k}$ (where they are more likely to be defined).

If for some reason you wish to terminate the current problem, set $m o d e < - 1$ .

ccon : float, array-like, shape $(max (1, ncnln))$

If $n e e d c [i - 1] > 0$ and $m o d e = 0$ or $2$ , $c c o n [i - 1]$ must contain the value of the $i$ th constraint at $x$ . The remaining elements of $c c o n$ , corresponding to the non-positive elements of $n e e d c$ , are ignored.

cjac : float, array-like, shape $(max (1, ncnln), n)$

If $n e e d c [i - 1] > 0$ and $m o d e = 1$ or $2$ , the $i$ th row of $c j a c$ must contain the available elements of the vector $\nabla c_{i}$ given by

$$\nabla c_{i} = \left(\frac{\partial c_{i}}{\partial x_{1}}, \frac{\partial c_{i}}{\partial x_{2}}, \dots, \frac{\partial c_{i}}{\partial x_{n}}\right)^{T},$$

where $\frac{\partial c_{i}}{\partial x_{j}}$ is the partial derivative of the $i$ th constraint with respect to the $j$ th variable, evaluated at the point $x$ . See also the argument $n s t a t e$ . The remaining rows of $c j a c$ , corresponding to non-positive elements of $n e e d c$ , are ignored.

If all elements of the constraint Jacobian are known (i.e., $‘Derivative Level' = 2$ or $3$ ), any constant elements may be assigned to $c j a c$ one time only at the start of the optimization.

An element of $c j a c$ that is not subsequently assigned in $c o n f u n$ will retain its initial value throughout.

Constant elements may be loaded into $c j a c$ during the first call to $c o n f u n$ (signalled by the value $n s t a t e = 1$ ).

The ability to preload constants is useful when many Jacobian elements are identically zero, in which case $c j a c$ may be initialized to zero and nonzero elements may be reset by $c o n f u n$ .

Note that constant nonzero elements do affect the values of the constraints.

Thus, if $c j a c [i - 1, j - 1]$ is set to a constant value, it need not be reset in subsequent calls to $c o n f u n$ , but the value $c j a c [i - 1, j - 1] \times x [j - 1]$ must nonetheless be added to $c c o n [i - 1]$ .

For example, if $c j a c [0, 0] = 2$ and $c j a c [0, 1] = - 5$ , then the term $2 \times x [0] - 5 \times x [1]$ must be included in the definition of $c c o n [0]$ .

It must be emphasized that, if $‘Derivative Level' = 0$ or $1$ , unassigned elements of $c j a c$ are not treated as constant; they are estimated by finite differences, at nontrivial expense.

If you do not supply a value for the option ‘Difference Interval’, an interval for each element of $x$ is computed automatically at the start of the optimization.

The automatic procedure can usually identify constant elements of $c j a c$ , which are then computed once only by finite differences.
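The interplay of $m o d e$ , $n e e d c$ and preloaded constant Jacobian elements can be sketched as follows. The single constraint $c_{1} (x) = 2 x_{1} - 5 x_{2} + x_{3}^{2}$ is a hypothetical example that reuses the constants $2$ and $- 5$ from the text above, and the sketch assumes $‘Derivative Level' = 2$ or $3$ so that preloaded constants are retained. Nested lists stand in for the ndarray that nlp2_solve would pass.

```python
def confun(mode, needc, x, cjac, nstate, data=None):
    """Sketch of a confun callback for one hypothetical nonlinear constraint."""
    ccon = [0.0]  # shape max(1, ncnln), with ncnln = 1 here
    if nstate == 1:
        # First call: load the constant Jacobian elements once only.
        cjac[0][0] = 2.0
        cjac[0][1] = -5.0
    if needc[0] > 0:
        if mode in (0, 2):
            # The constant-Jacobian terms still contribute to the constraint value.
            ccon[0] = 2.0 * x[0] - 5.0 * x[1] + x[2] ** 2
        if mode in (1, 2):
            # Only the non-constant Jacobian element is reassigned.
            cjac[0][2] = 2.0 * x[2]
    return mode, ccon, cjac
```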

data : arbitrary, optional

User-communication data for callback functions.

io_manager : FileObjManager, optional

Manager for I/O in this routine.

spiked_sorder : str, optional

If $a$ and $h$ are spiked (i.e., have unit extent in all but one dimension, or have size $1$ ), $spiked\_sorder$ selects the storage order to associate with them in the NAG Engine:

spiked_sorder = $'C'$: row-major storage will be used;
spiked_sorder = $'F'$: column-major storage will be used.

Two-dimensional arrays returned from callback functions in this routine must then use the same storage order.

Returns

majits : int

The number of major iterations performed.

istate : int, ndarray, shape $(n + nclin + ncnln)$

Describes the status of the constraints $l \leq r (x) \leq u$ . For the $j$ th lower or upper bound, $j = 1, 2, \dots, n + nclin + ncnln$ , the possible values of $i s t a t e [j - 1]$ are as follows (see Figure [label omitted]). $δ$ is the appropriate feasibility tolerance.

$- 2$	(Region 1) The lower bound is violated by more than $δ$ .
$- 1$	(Region 5) The upper bound is violated by more than $δ$ .
$0$	(Region 3) Both bounds are satisfied by more than $δ$ .
$1$	(Region 2) The lower bound is active (to within $δ$ ).
$2$	(Region 4) The upper bound is active (to within $δ$ ).
$3$	(Region $2 \equiv 4$ ) The bounds are equal and the equality constraint is satisfied (to within $δ$ ).

These values of $i s t a t e$ are labelled in the printed solution according to Table [label omitted].

Region	$1$	$2$	$3$	$4$	$5$	$2 \equiv 4$
$i s t a t e [j - 1]$	$- 2$	$1$	$0$	$2$	$- 1$	$3$
Printed solution	--	LL	FR	UL	++	EQ
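The status codes above can be reproduced for a single constraint value $r_{j} (x)$ with bounds $l_{j}$ and $u_{j}$ and tolerance $δ$ . This is a minimal sketch: the function name classify_constraint and the order in which the tests are applied are assumptions, not the library's actual logic.

```python
def classify_constraint(r, lower, upper, delta):
    """Map a constraint value to the istate code described in the list above."""
    if lower == upper and abs(r - lower) <= delta:
        return 3    # equal bounds, equality satisfied to within delta
    if r < lower - delta:
        return -2   # lower bound violated by more than delta
    if r > upper + delta:
        return -1   # upper bound violated by more than delta
    if abs(r - lower) <= delta:
        return 1    # lower bound active to within delta
    if abs(r - upper) <= delta:
        return 2    # upper bound active to within delta
    return 0        # both bounds satisfied by more than delta
```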

ccon : float, ndarray, shape $(max (1, ncnln))$

If $ncnln > 0$ , $c c o n [i - 1]$ contains the value of the $i$ th nonlinear constraint function $c_{i}$ at the final iterate, for $i = 1, 2, \dots, ncnln$ .

If $ncnln = 0$ , the array $c c o n$ is not referenced.

cjac : float, ndarray, shape $(ncnln, :)$

If $ncnln > 0$ , $c j a c$ contains the Jacobian matrix of the nonlinear constraint functions at the final iterate, i.e., $c j a c [i - 1, j - 1]$ contains the partial derivative of the $i$ th constraint function with respect to the $j$ th variable, for $j = 1, 2, \dots, n$ , for $i = 1, 2, \dots, ncnln$ . (See the discussion of argument $c j a c$ under $c o n f u n$ .)

If $ncnln = 0$ , the array $c j a c$ is not referenced.

clamda : float, ndarray, shape $(n + nclin + ncnln)$

The values of the QP multipliers from the last QP subproblem. $c l a m d a [j - 1]$ should be non-negative if $i s t a t e [j - 1] = 1$ and non-positive if $i s t a t e [j - 1] = 2$ .

objf : float

The value of the objective function at the final iterate.

grad : float, ndarray, shape $(n)$

The gradient of the objective function (or its finite difference approximation) at the final iterate.

h : float, ndarray, shape $(n, n)$

Contains the Hessian of the Lagrangian at the final estimate $x$ .

x : float, ndarray, shape $(n)$

The final estimate of the solution.

Other Parameters

‘Central Difference Interval’ : float

Default $= ϵ_{r}^{\frac{1}{3}}$

When $‘Derivative Level' < 3$ , the central-difference interval $r$ is used near an optimal solution to obtain more accurate (but more expensive) estimates of gradients. Twice as many function evaluations are required compared to forward differencing. The interval used for the $j$ th variable is $h_{j} = r (1 + |x_{j}|)$ . The resulting derivative estimates should be accurate to $O (r^{2})$ , unless the functions are badly scaled.

If you supply a value for this option, a small value between $0.0$ and $1.0$ is appropriate.
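To make the interval formula concrete, here is a minimal sketch of a central-difference estimate using $h_{j} = r (1 + |x_{j}|)$ . The function name central_difference is hypothetical, and this is a textbook central difference rather than the library's internal code.

```python
def central_difference(f, x, j, r):
    """Estimate dF/dx_j at x by central differences with interval r."""
    h = r * (1.0 + abs(x[j]))      # interval for the j-th variable
    x_plus = list(x)
    x_minus = list(x)
    x_plus[j] += h
    x_minus[j] -= h
    # Two function evaluations per variable, truncation error O(r**2).
    return (f(x_plus) - f(x_minus)) / (2.0 * h)
```

Each estimate costs two evaluations of $F$ , which is the factor of two relative to forward differencing mentioned above.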

‘Check Frequency’ : int

Default $= 60$

Every $i$ th minor iteration after the most recent basis factorization, a numerical test is made to see if the current solution $x$ satisfies the general linear constraints (the linear constraints and the linearized nonlinear constraints, if any). The constraints are of the form $A x - s = b$ , where $s$ is the set of slack variables. To perform the numerical test, the residual vector $r = b - A x + s$ is computed. If the largest component of $r$ is judged to be too large, the current basis is refactorized and the basic variables are recomputed to satisfy the general constraints more accurately. If $i \leq 0$ , the value of $i = 99999999$ is used and effectively no checks are made.

$‘Check Frequency' = 1$ is useful for debugging purposes, but otherwise this option should not be needed.
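The residual test described above amounts to the following computation. This is a sketch only: the helper names are hypothetical, dense nested lists stand in for the solver's internal storage, and the threshold for "too large" is not specified in the text, so none is assumed here.

```python
def constraint_residual(A, x, s, b):
    """Return r = b - A x + s for constraints written as A x - s = b."""
    return [
        b_i - sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + s_i
        for row, s_i, b_i in zip(A, s, b)
    ]


def largest_residual(A, x, s, b):
    """Largest component of the residual in absolute value."""
    return max(abs(r_i) for r_i in constraint_residual(A, x, s, b))
```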

‘Cold Start’ : valueless

Default

This option controls the specification of the initial working set in the procedure for finding a feasible point for the linear constraints and bounds and in the first QP subproblem thereafter. With a ‘Cold Start’, the first working set is chosen by nlp2_solve based on the values of the variables and constraints at the initial point. Broadly speaking, the initial working set will include equality constraints and bounds or inequality constraints that violate or ‘nearly’ satisfy their bounds (to within ‘Crash Tolerance’).

With a ‘Warm Start’, you must set the $i s t a t e$ array and define $c l a m d a$ and $h$ as discussed in Parameters. $i s t a t e$ values associated with bounds and linear constraints determine the initial working set of the procedure to find a feasible point with respect to the bounds and linear constraints. $i s t a t e$ values associated with nonlinear constraints determine the initial working set of the first QP subproblem after such a feasible point has been found. nlp2_solve will override your specification of $i s t a t e$ if necessary, so that a poor choice of the working set will not cause a fatal error. For instance, any elements of $i s t a t e$ which are set to $- 2$ , $- 1$ or $4$ will be reset to zero, as will any elements which are set to $3$ when the corresponding elements of $b l$ and $b u$ are not equal. A warm start will be advantageous if a good estimate of the initial working set is available – for example, when nlp2_solve is called repeatedly to solve related problems.

‘Warm Start’ : valueless

See the description under ‘Cold Start’, which covers both options.

‘Crash Option’ : int

Default $= 3$

If a ‘Cold Start’ is specified, an internal Crash procedure is used to select an initial basis from certain rows and columns of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ . The option ‘Crash Option’ $i$ determines which rows and columns of $A$ are eligible initially, and how many times the Crash procedure is called. Columns of $- I$ are used to pad the basis where necessary.

$i$	Meaning
$0$	The initial basis contains only slack variables: $B = I$ .
$1$	The Crash procedure is called once, looking for a triangular basis in all rows and columns of $A$ .
$2$	The Crash procedure is called twice (if there are nonlinear constraints). The first call looks for a triangular basis in linear rows, and the iteration proceeds with simplex iterations until the linear constraints are satisfied. The Jacobian is then evaluated for the first major iteration and the Crash procedure is called again to find a triangular basis in the nonlinear rows (retaining the current basis for linear rows).
$3$	The Crash procedure is called up to three times (if there are nonlinear constraints). The first two calls treat linear equalities and linear inequalities separately. As before, the last call treats nonlinear rows before the first major iteration.

If $i \geq 1$ , certain slacks on inequality rows are selected for the basis first. (If $i \geq 2$ , numerical values are used to exclude slacks that are close to a bound). The Crash procedure then makes several passes through the columns of $A$ , searching for a basis matrix that is essentially triangular. A column is assigned to ‘pivot’ on a particular row if the column contains a suitably large element in a row that has not yet been assigned. (The pivot elements ultimately form the diagonals of the triangular basis.) For remaining unassigned rows, slack variables are inserted to complete the basis.

The ‘Crash Tolerance’ $r$ allows the starting Crash procedure to ignore certain ‘small’ nonzeros in each column of $A$ . If $a_{max}$ is the largest element in column $j$ , other nonzeros $a_{i j}$ in that column are ignored if $|a_{i j}| \leq a_{max} \times r$ . (To be meaningful, $r$ must be in the range $0 \leq r < 1$ .)

When $r > 0.0$ , the basis obtained by the Crash procedure may not be strictly triangular, but it is likely to be nonsingular and almost triangular. The intention is to obtain a starting basis containing more columns of $A$ and fewer (arbitrary) slacks. A feasible solution may be reached sooner on some problems.

For example, suppose the first $m$ columns of $A$ form the matrix shown under ‘LU Factor Tolerance’; i.e., a tridiagonal matrix with entries $- 1$ , $4$ , $- 1$ . To help the Crash procedure choose all $m$ columns for the initial basis, we would specify a ‘Crash Tolerance’ of $r$ for some value of $r > 0.5$ .
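The effect of the tolerance on a single column can be sketched as below, using an interior column of the tridiagonal example with entries $- 1$ , $4$ , $- 1$ . The function name significant_rows is hypothetical, and taking $a_{max}$ as the largest absolute value in the column is an assumption.

```python
def significant_rows(column, r):
    """Row indices whose entries the Crash procedure would not ignore."""
    # Assumption: a_max is the largest entry of the column in absolute value.
    a_max = max(abs(a) for a in column)
    # A nonzero is ignored when |a_ij| <= a_max * r, so keep the others.
    return [i for i, a in enumerate(column) if a != 0 and abs(a) > a_max * r]
```

With $r = 0.6$ only the diagonal entry survives, so the column can pivot on its own row; with the default $r = 0.1$ all three entries remain significant.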

‘Crash Tolerance’ : float

Default $= 0.1$

See the description under ‘Crash Option’, which covers both ‘Crash Option’ and ‘Crash Tolerance’.

‘Defaults’ : valueless

This special keyword may be used to reset all options to their default values.

‘Derivative Level’ : int

Default $= 3$

Option ‘Derivative Level’ specifies which nonlinear function gradients are known analytically and will be supplied to nlp2_solve by functions $o b j f u n$ and $c o n f u n$ .

$i$	Meaning
$3$	All objective and constraint gradients are known.
$2$	All constraint gradients are known, but some or all components of the objective gradient are unknown.
$1$	The objective gradient is known, but some or all of the constraint gradients are unknown.
$0$	Some components of the objective gradient are unknown and some of the constraint gradients are unknown.

The value $i = 3$ should be used whenever possible. It is the most reliable and will usually be the most efficient.

If $i = 0$ or $2$ , nlp2_solve will estimate the missing components of the objective gradient, using finite differences. This may simplify the coding of $o b j f u n$ . However, it could increase the total run-time substantially (since a special call to $o b j f u n$ is required for each missing element), and there is less assurance that an acceptable solution will be located. If the nonlinear variables are not well scaled, it may be necessary to specify a non-default option ‘Difference Interval’.

If $i = 0$ or $1$ , nlp2_solve will estimate missing elements of the Jacobian. For each column of the Jacobian, one call to $c o n f u n$ is needed to estimate all missing elements in that column, if any.

At times, central differences are used rather than forward differences. (This is not under your control.)
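The table of ‘Derivative Level’ values can be summarized as a two-flag encoding: one unit for a fully known objective gradient and two units for fully known constraint gradients. The helper below is a hypothetical convenience for choosing the value, not part of the library.

```python
def derivative_level(objective_gradient_known, constraint_gradients_known):
    """Choose the 'Derivative Level' value from what the callbacks supply."""
    level = 0
    if objective_gradient_known:
        level += 1   # objfun sets every element of grad
    if constraint_gradients_known:
        level += 2   # confun sets every element of cjac
    return level
```

The resulting integer is the value to supply for the option, for instance through one of the option-setting functions listed in the note near the top of this page.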

‘Derivative Linesearch’ : valueless

Default

At each major iteration a linesearch is used to improve the merit function. Option ‘Derivative Linesearch’ uses safeguarded cubic interpolation and requires both function and gradient values to compute estimates of the step $α_{k}$ . If some analytic derivatives are not provided, or option ‘Nonderivative Linesearch’ is specified, nlp2_solve employs a linesearch based upon safeguarded quadratic interpolation, which does not require gradient evaluations.

A nonderivative linesearch can be slightly less robust on difficult problems, and it is recommended that the default be used if the functions and derivatives can be computed at approximately the same cost. If the gradients are very expensive relative to the functions, a nonderivative linesearch may give a significant decrease in computation time.

If ‘Nonderivative Linesearch’ is selected, nlp2_solve signals the evaluation of the linesearch by calling $o b j f u n$ with $m o d e = 0$ . If the potential saving provided by a nonderivative linesearch is to be realised, it is essential that $o b j f u n$ be coded so that derivatives are not computed when $m o d e = 0$ .

‘Nonderivative Linesearch’ : valueless

See the description under ‘Derivative Linesearch’, which covers both options.

‘Difference Interval’ : float

Default $= \sqrt{ϵ_{r}}$

This alters the interval $r$ used to estimate gradients by forward differences. It does so in the following circumstances:

in the initial (‘cheap’) phase of verifying the problem derivatives;
for verifying the problem derivatives;
for estimating missing derivatives.

In all cases, a derivative with respect to $x_{j}$ is estimated by perturbing that component of $x$ to the value $x_{j} + r (1 + |x_{j}|)$ , and then evaluating $F (x)$ or $c (x)$ at the perturbed point. The resulting gradient estimates should be accurate to $O (r)$ unless the functions are badly scaled. Judicious alteration of $r$ may sometimes lead to greater accuracy.

If you supply a value for this option, a small value between $0.0$ and $1.0$ is appropriate.
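The perturbation rule described above can be sketched in plain Python as follows. This is an illustration of the formula only, not the library's internal code; `forward_difference_gradient`, `example_f` and the value used for $ϵ_{r}$ are hypothetical.

```python
import math


def forward_difference_gradient(func, x, r):
    """Estimate the gradient of func at x, perturbing x_j by r * (1 + |x_j|)."""
    f0 = func(x)
    grad = []
    for j, xj in enumerate(x):
        step = r * (1.0 + abs(xj))
        x_pert = list(x)
        x_pert[j] = xj + step
        grad.append((func(x_pert) - f0) / step)
    return grad


def example_f(x):
    # Hypothetical smooth test function; its exact gradient is (2*x0 + 3*x1, 3*x0).
    return x[0] ** 2 + 3.0 * x[0] * x[1]


r = math.sqrt(1e-15)  # assumed stand-in for sqrt(eps_r)
estimate = forward_difference_gradient(example_f, [1.0, 2.0], r)
```

At the point $(1, 2)$ the exact gradient is $(8, 3)$, and the estimate agrees with it to roughly the size of the interval.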

‘Dump File’int

Default $= 0$

Options ‘Dump File’ and ‘Load File’ are similar to options ‘Punch File’ and ‘Insert File’, but they record solution information in a manner that is more direct and more easily modified. A full description of information recorded in options ‘Dump File’ and ‘Load File’ is given in Gill et al. (2005a).

If $i_{1} > 0$ , the last solution obtained will be output to the file with unit number $i_{1}$ .

If $i_{2} > 0$ , the ‘Load File’, containing basis information, will be read. The file will usually have been output previously as a ‘Dump File’. The file will not be accessed if options ‘Old Basis File’ or ‘Insert File’ are specified.

‘Load File’int

Default $= 0$

Options ‘Dump File’ and ‘Load File’ are similar to options ‘Punch File’ and ‘Insert File’, but they record solution information in a manner that is more direct and more easily modified. A full description of information recorded in options ‘Dump File’ and ‘Load File’ is given in Gill et al. (2005a).

If $i_{1} > 0$ , the last solution obtained will be output to the file with unit number $i_{1}$ .

If $i_{2} > 0$ , the ‘Load File’, containing basis information, will be read. The file will usually have been output previously as a ‘Dump File’. The file will not be accessed if options ‘Old Basis File’ or ‘Insert File’ are specified.

‘Elastic Weight’float

Default $= 10^{4}$

This keyword determines the initial weight $γ$ associated with the problem (1) (see Treatment of Constraint Infeasibilities).

At major iteration $k$ , if elastic mode has not yet started, a scale factor $σ_{k} = 1 + {∥ g (x_{k}) ∥}_{\infty}$ is defined from the current objective gradient. Elastic mode is then started if the QP subproblem is infeasible, or the QP dual variables are larger in magnitude than $σ_{k} r$ . The QP is resolved in elastic mode with $γ = σ_{k} r$ .

Thereafter, major iterations continue in elastic mode until they converge to a point that is optimal for (1) (see Treatment of Constraint Infeasibilities). If the point is feasible for equation (1) $(v = w = 0)$ , it is declared locally optimal. Otherwise, $γ$ is increased by a factor of $10$ and major iterations continue. If $γ$ has already reached a maximum allowable value, equation (1) is declared locally infeasible.
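The test for entering elastic mode can be sketched as below, assuming the QP dual variables and the objective gradient are available as plain sequences. The function names are hypothetical and the sketch only mirrors the conditions stated above.

```python
def elastic_scale(gradient):
    """sigma_k = 1 + infinity norm of the current objective gradient."""
    return 1.0 + max(abs(g) for g in gradient)


def should_start_elastic(qp_infeasible, qp_duals, gradient, r):
    """Elastic mode starts if the QP is infeasible or any dual exceeds sigma_k * r in magnitude."""
    threshold = elastic_scale(gradient) * r
    return qp_infeasible or any(abs(p) > threshold for p in qp_duals)
```

If elastic mode is started, the same product $σ_{k} r$ gives the initial value of $γ$.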

‘Expand Frequency’int

Default $= 10000$

This option is part of the anti-cycling procedure designed to make progress even on highly degenerate problems.

For linear models, the strategy is to force a positive step at every iteration, at the expense of violating the bounds on the variables by a small amount. Suppose that the option ‘Minor Feasibility Tolerance’ is $δ$ . Over a period of $i$ iterations, the tolerance actually used by nlp2_solve increases from $0.5 δ$ to $δ$ (in steps of $0.5 δ / i$ ).

For nonlinear models, the same procedure is used for iterations in which there is only one superbasic variable. (Cycling can occur only when the current solution is at a vertex of the feasible region.) Thus, zero steps are allowed if there is more than one superbasic variable, but otherwise positive steps are enforced.

Increasing $i$ helps reduce the number of slightly infeasible nonbasic variables (most of which are eliminated during a resetting procedure). However, it also diminishes the freedom to choose a large pivot element (see option ‘Pivot Tolerance’).
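The growth of the working tolerance over one expand cycle can be illustrated with a short sketch. The helper name `expand_tolerances` is hypothetical; it simply tabulates the values $0.5 δ, 0.5 δ + 0.5 δ / i, \dots, δ$ described above.

```python
def expand_tolerances(delta, i):
    """Working feasibility tolerance at the start and after each of i iterations."""
    step = 0.5 * delta / i
    return [0.5 * delta + k * step for k in range(i + 1)]


# Example with a small cycle length so the ramp is easy to inspect.
tolerances = expand_tolerances(1.0e-6, 4)
```

A larger $i$ gives a finer ramp, which is why each individual bound violation stays small.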

‘Factorization Frequency’int

Default $= 50$

At most $i$ basis changes will occur between factorizations of the basis matrix.

With linear programs, the basis factors are usually updated every iteration. The default $i$ is reasonable for typical problems. Higher values up to $i = 100$ (say) may be more efficient on well-scaled problems.

When the objective function is nonlinear, fewer basis updates will occur as an optimum is approached. The number of iterations between basis factorizations will, therefore, increase. During these iterations a test is made regularly (according to the option ‘Check Frequency’) to ensure that the general constraints are satisfied. If necessary the basis will be refactorized before the limit of $i$ updates is reached.

‘Function Precision’float

Default $= ϵ^{0.8}$

The relative function precision $ϵ_{r}$ is intended to be a measure of the relative accuracy with which the functions can be computed. For example, if $F (x)$ is computed as $1000.56789$ for some relevant $x$ and if the first $6$ significant digits are known to be correct, the appropriate value for $ϵ_{r}$ would be $1.0 e - 6$ .

(Ideally the functions $F (x)$ or $c_{i} (x)$ should have magnitude of order $1$ . If all functions are substantially less than $1$ in magnitude, $ϵ_{r}$ should be the absolute precision. For example, if $F (x) = 1.23456789 e - 4$ at some point and if the first $6$ significant digits are known to be correct, the appropriate value for $ϵ_{r}$ would be $1.0 e - 10$ .)

The default value of $ϵ_{r}$ is appropriate for simple analytic functions.

In some cases the function values will be the result of extensive computation, possibly involving a costly iterative procedure that can provide few digits of precision. Specifying an appropriate ‘Function Precision’ may lead to savings, by allowing the linesearch procedure to terminate when the difference between function values along the search direction becomes as small as the absolute error in the values.
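The two worked examples above can be reproduced with a small helper. This is a sketch under a simplifying assumption: magnitudes of at least $1$ are treated with the relative rule, and smaller magnitudes with the absolute rule; `function_precision` is a hypothetical name.

```python
import math


def function_precision(f_value, correct_digits):
    """Suggest eps_r from a typical function value and its number of correct significant digits."""
    relative = 10.0 ** (-correct_digits)
    magnitude = abs(f_value)
    if magnitude >= 1.0 or magnitude == 0.0:
        return relative
    # For small values use the absolute precision: scale by the leading power of ten.
    return relative * 10.0 ** math.floor(math.log10(magnitude))
```

For $F (x) = 1000.56789$ with $6$ correct digits this suggests $1.0 e - 6$, and for $F (x) = 1.23456789 e - 4$ it suggests $1.0 e - 10$.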

‘Hessian Full Memory’valueless

Default if $n \leq 75$

These options select the method for storing and updating the approximate Hessian. (nlp2_solve uses a quasi-Newton approximation to the Hessian of the Lagrangian. A BFGS update is applied after each major iteration.)

If ‘Hessian Full Memory’ is specified, the approximate Hessian is treated as a dense matrix and the BFGS updates are applied explicitly. This option is most efficient when the number of variables $n$ is not too large (say, less than $75$ ). In this case, the storage requirement is fixed and one can expect $1$ -step Q-superlinear convergence to the solution.

‘Hessian Limited Memory’ should be used on problems where $n$ is very large. In this case a limited-memory procedure is used to update a diagonal Hessian approximation $H_{r}$ a limited number of times. (Updates are accumulated as a list of vector pairs. They are discarded at regular intervals after $H_{r}$ has been reset to their diagonal.)

‘Hessian Limited Memory’valueless

Default if $n > 75$

These options select the method for storing and updating the approximate Hessian. (nlp2_solve uses a quasi-Newton approximation to the Hessian of the Lagrangian. A BFGS update is applied after each major iteration.)

If ‘Hessian Full Memory’ is specified, the approximate Hessian is treated as a dense matrix and the BFGS updates are applied explicitly. This option is most efficient when the number of variables $n$ is not too large (say, less than $75$ ). In this case, the storage requirement is fixed and one can expect $1$ -step Q-superlinear convergence to the solution.

‘Hessian Limited Memory’ should be used on problems where $n$ is very large. In this case a limited-memory procedure is used to update a diagonal Hessian approximation $H_{r}$ a limited number of times. (Updates are accumulated as a list of vector pairs. They are discarded at regular intervals after $H_{r}$ has been reset to their diagonal.)

‘Hessian Frequency’int

Default $= 99999999$

If option ‘Hessian Full Memory’ is in effect and $i$ BFGS updates have already been carried out, the Hessian approximation is reset to the identity matrix. (For certain problems, occasional resets may improve convergence, but in general they should not be necessary.)

‘Hessian Full Memory’ and $‘Hessian Frequency' = 10$ have a similar effect to ‘Hessian Limited Memory’ and $‘Hessian Updates' = 10$ (except that the latter retains the current diagonal during resets).

‘Hessian Updates’int

Default $= ‘Hessian Frequency'$ if ‘Hessian Full Memory’, $10$ otherwise

If option ‘Hessian Limited Memory’ is in effect and $i$ BFGS updates have already been carried out, all but the diagonal elements of the accumulated updates are discarded and the updating process starts again.

Broadly speaking, the more updates stored, the better the quality of the approximate Hessian. However, the more vectors stored, the greater the cost of each QP iteration. The default value is likely to give a robust algorithm without significant expense, but faster convergence can sometimes be obtained with significantly fewer updates (e.g., $i = 5$ ).

‘Infinite Bound Size’float

Default $= 10^{20}$

If $r > 0$ , $r$ defines the ‘infinite’ bound $bigbnd$ in the definition of the problem constraints. Any upper bound greater than or equal to $bigbnd$ will be regarded as $+ \infty$ (and similarly any lower bound less than or equal to $- bigbnd$ will be regarded as $- \infty$ ). If $r < 0$ , the default value is used.
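The interpretation of bounds relative to $bigbnd$ can be sketched as follows. The helper `effective_bounds` is hypothetical and works on plain lists; it only illustrates which entries of $b l$ and $b u$ would be regarded as infinite.

```python
import math


def effective_bounds(bl, bu, bigbnd=1.0e20):
    """Map bounds at or beyond +/- bigbnd to infinities, leaving the others unchanged."""
    lower = [-math.inf if b <= -bigbnd else b for b in bl]
    upper = [math.inf if b >= bigbnd else b for b in bu]
    return lower, upper
```

For example, an upper bound of exactly $10^{20}$ is treated as absent under the default setting.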

‘Iterations Limit’int

Default $= m a x (10000, 10 m a x (n, n_{L} + n_{N}))$

The value of $i$ specifies the maximum number of minor iterations allowed (i.e., iterations of the simplex method or the QP algorithm), summed over all major iterations. (See also the description of the option ‘Minor Iterations Limit’.)

‘Linesearch Tolerance’float

Default $= 0.9$

This tolerance, $r$ , controls the accuracy with which a step length will be located along the direction of search each iteration. At the start of each linesearch a target directional derivative for the merit function is identified. This argument determines the accuracy to which this target value is approximated, and it must be a value in the range $0.0 \leq r \leq 1.0$ .

The default value $r = 0.9$ requests just moderate accuracy in the linesearch.

If the nonlinear functions are cheap to evaluate, a more accurate search may be appropriate; try $r = 0.1$ , $0.01$ or $0.001$ .

If the nonlinear functions are expensive to evaluate, a less accurate search may be appropriate. If all gradients are known, try $r = 0.99$ . (The number of major iterations might increase, but the total number of function evaluations may decrease enough to compensate.)

If not all gradients are known, a moderately accurate search remains appropriate. Each search will require only $1$ to $5$ function values (typically), but many function calls will then be needed to estimate missing gradients for the next iteration.

‘Nolist’valueless

Default

Option ‘List’ enables printing of each option specification as it is supplied. ‘Nolist’ suppresses this printing.

‘List’valueless

Option ‘List’ enables printing of each option specification as it is supplied. ‘Nolist’ suppresses this printing.

‘LU Density Tolerance’float

Default $= 0.6$

The density tolerance, $r_{1}$ , is used during $L U$ factorization of the basis matrix $B$ . Columns of $L$ and rows of $U$ are formed one at a time, and the remaining rows and columns of the basis are altered appropriately. At any stage, if the density of the remaining matrix exceeds $r_{1}$ , the Markowitz strategy for choosing pivots is terminated, and the remaining matrix is factored by a dense $L U$ procedure. Raising the density tolerance towards $1.0$ may give slightly sparser $L U$ factors, with a slight increase in factorization time.

The singularity tolerance, $r_{2}$ , helps guard against ill-conditioned basis matrices. After $B$ is refactorized, the diagonal elements of $U$ are tested as follows: if $| u_{j j} | \leq r_{2}$ or $| u_{j j} | < r_{2} \max_{i} | u_{i j} |$ , the $j$ th column of the basis is replaced by the corresponding slack variable. (This is most likely to occur after a restart.)

‘LU Singularity Tolerance’float

Default $= ϵ^{\frac{2}{3}}$

The density tolerance, $r_{1}$ , is used during $L U$ factorization of the basis matrix $B$ . Columns of $L$ and rows of $U$ are formed one at a time, and the remaining rows and columns of the basis are altered appropriately. At any stage, if the density of the remaining matrix exceeds $r_{1}$ , the Markowitz strategy for choosing pivots is terminated, and the remaining matrix is factored by a dense $L U$ procedure. Raising the density tolerance towards $1.0$ may give slightly sparser $L U$ factors, with a slight increase in factorization time.

The singularity tolerance, $r_{2}$ , helps guard against ill-conditioned basis matrices. After $B$ is refactorized, the diagonal elements of $U$ are tested as follows: if $| u_{j j} | \leq r_{2}$ or $| u_{j j} | < r_{2} \max_{i} | u_{i j} |$ , the $j$ th column of the basis is replaced by the corresponding slack variable. (This is most likely to occur after a restart.)
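The diagonal test on $U$ can be written out as a short sketch. It assumes $U$ is held as a dense square list of rows, which is an illustration only (the solver works with sparse factors); `singular_columns` is a hypothetical helper and returns zero-based column indices.

```python
def singular_columns(u, r2):
    """Return the zero-based columns j of U that fail the singularity test."""
    n = len(u)
    flagged = []
    for j in range(n):
        ujj = abs(u[j][j])
        col_max = max(abs(u[i][j]) for i in range(n))
        # Flag column j if |u_jj| <= r2 or |u_jj| < r2 * max_i |u_ij|.
        if ujj <= r2 or ujj < r2 * col_max:
            flagged.append(j)
    return flagged
```

Each flagged column is one that would be replaced in the basis by the corresponding slack variable.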

‘LU Factor Tolerance’float

Default $= 1.10$

The values of $r_{1}$ and $r_{2}$ affect the stability of the basis factorization $B = L U$ , during refactorization and updates respectively. The lower triangular matrix $L$ is a product of matrices of the form

\begin{pmatrix} 1 & \\ μ & 1 \end{pmatrix}

where the multipliers $μ$ will satisfy $| μ | \leq r_{i}$ . The default values of $r_{1}$ and $r_{2}$ usually strike a good compromise between stability and sparsity. They must satisfy $r_{1}$ , $r_{2} \geq 1.0$ .

For large and relatively dense problems, $r_{1} = 10.0$ or $5.0$ (say) may give a useful improvement in stability without impairing sparsity to a serious degree.

For certain very regular structures (e.g., band matrices) it may be necessary to reduce $r_{1}$ and/or $r_{2}$ in order to achieve stability. For example, if the columns of $A$ include a sub-matrix of the form

\begin{pmatrix} 4 & - 1 & & & & \\ - 1 & 4 & - 1 & & & \\ & - 1 & 4 & - 1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & - 1 & 4 & - 1 \\ & & & & - 1 & 4 \end{pmatrix},

one should set both $r_{1}$ and $r_{2}$ to values in the range $1.0 \leq r_{i} < 4.0$ .

‘LU Update Tolerance’float

Default $= 1.10$

The values of $r_{1}$ and $r_{2}$ affect the stability of the basis factorization $B = L U$ , during refactorization and updates respectively. The lower triangular matrix $L$ is a product of matrices of the form

\begin{pmatrix} 1 & \\ μ & 1 \end{pmatrix}

where the multipliers $μ$ will satisfy $| μ | \leq r_{i}$ . The default values of $r_{1}$ and $r_{2}$ usually strike a good compromise between stability and sparsity. They must satisfy $r_{1}$ , $r_{2} \geq 1.0$ .

For large and relatively dense problems, $r_{1} = 10.0$ or $5.0$ (say) may give a useful improvement in stability without impairing sparsity to a serious degree.

For certain very regular structures (e.g., band matrices) it may be necessary to reduce $r_{1}$ and/or $r_{2}$ in order to achieve stability. For example, if the columns of $A$ include a sub-matrix of the form

\begin{pmatrix} 4 & - 1 & & & & \\ - 1 & 4 & - 1 & & & \\ & - 1 & 4 & - 1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & - 1 & 4 & - 1 \\ & & & & - 1 & 4 \end{pmatrix},

one should set both $r_{1}$ and $r_{2}$ to values in the range $1.0 \leq r_{i} < 4.0$ .

‘LU Partial Pivoting’valueless

Default

The $L U$ factorization implements a Markowitz-type search for pivots that locally minimize the fill-in subject to a threshold pivoting stability criterion. The default option is to use threshold partial pivoting. The options ‘LU Rook Pivoting’ and ‘LU Complete Pivoting’ are more expensive than partial pivoting but are more stable and better at revealing rank, as long as ‘LU Factor Tolerance’ is not too large (say $< 2.0$ ). When numerical difficulties are encountered, nlp2_solve automatically reduces the $L U$ tolerance towards $1.0$ and switches (if necessary) to rook or complete pivoting, before reverting to the default or specified options at the next refactorization (with ‘System Information Yes’, relevant messages are output to the ‘Print File’).

‘LU Complete Pivoting’valueless

The $L U$ factorization implements a Markowitz-type search for pivots that locally minimize the fill-in subject to a threshold pivoting stability criterion. The default option is to use threshold partial pivoting. The options ‘LU Rook Pivoting’ and ‘LU Complete Pivoting’ are more expensive than partial pivoting but are more stable and better at revealing rank, as long as ‘LU Factor Tolerance’ is not too large (say $< 2.0$ ). When numerical difficulties are encountered, nlp2_solve automatically reduces the $L U$ tolerance towards $1.0$ and switches (if necessary) to rook or complete pivoting, before reverting to the default or specified options at the next refactorization (with ‘System Information Yes’, relevant messages are output to the ‘Print File’).

‘LU Rook Pivoting’valueless

The $L U$ factorization implements a Markowitz-type search for pivots that locally minimize the fill-in subject to a threshold pivoting stability criterion. The default option is to use threshold partial pivoting. The options ‘LU Rook Pivoting’ and ‘LU Complete Pivoting’ are more expensive than partial pivoting but are more stable and better at revealing rank, as long as ‘LU Factor Tolerance’ is not too large (say $< 2.0$ ). When numerical difficulties are encountered, nlp2_solve automatically reduces the $L U$ tolerance towards $1.0$ and switches (if necessary) to rook or complete pivoting, before reverting to the default or specified options at the next refactorization (with ‘System Information Yes’, relevant messages are output to the ‘Print File’).

‘Major Feasibility Tolerance’float

Default $= m a x (10^{- 6}, \sqrt{ϵ})$

This tolerance, $r$ , specifies how accurately the nonlinear constraints should be satisfied. The default value is appropriate when the linear and nonlinear constraints contain data to about that accuracy.

Let $v_{m a x}$ be the maximum nonlinear constraint violation, normalized by the size of the solution, which is required to satisfy

v_{max} = \max_{i} v_{i} / ∥ x ∥ \leq r,

where $v_{i}$ is the violation of the $i$ th nonlinear constraint $(i = 1 : n_{N})$ .

In the major iteration log (see Minor Iteration Log), $v_{m a x}$ appears as the quantity labelled ‘Feasible’. If some of the problem functions are known to be of low accuracy, a larger ‘Major Feasibility Tolerance’ may be appropriate.
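The normalized violation test can be sketched as below. The norm used for $∥ x ∥$ is not stated in this section, so the Euclidean norm here is an assumption, as are the helper name and the requirement that $x$ is nonzero.

```python
import math


def max_normalized_violation(violations, x):
    """v_max = max_i v_i / ||x||, with the Euclidean norm assumed for ||x||."""
    norm_x = math.sqrt(sum(xj * xj for xj in x))  # assumption: 2-norm, x nonzero
    return max(violations) / norm_x


# Hypothetical nonlinear constraint violations at a trial point x = (3, 4).
v_max = max_normalized_violation([1.0e-7, 3.0e-7], [3.0, 4.0])
is_feasible = v_max <= 1.0e-6
```

Here the largest violation $3 × 10^{- 7}$ is divided by $∥ x ∥ = 5$ before being compared with $r$.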

‘Major Optimality Tolerance’float

Default $= 2 m a x (10^{- 6}, \sqrt{ϵ})$

This tolerance, $r$ , specifies the final accuracy of the dual variables. On successful termination, nlp2_solve will have computed a solution $(x, s, π)$ such that

c_{max} = \max_{j} c_{j} / ∥ π ∥ \leq r,

where $c_{j}$ is an estimate of the complementarity slackness for variable $j$ where $j = 1 : n + n_{L} + n_{N}$ . The values $c_{j}$ are computed from the final QP solution using the reduced gradients $d_{j} = g_{j} - π^{T} a_{j}$ (where $g_{j}$ is the $j$ th component of the objective gradient, $a_{j}$ is the associated column of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ , and $π$ is the set of QP dual variables):

c_{j} = \begin{cases} d_{j} \min (x_{j} - l_{j}, 1) & \text{if } d_{j} \geq 0; \\ - d_{j} \min (u_{j} - x_{j}, 1) & \text{if } d_{j} < 0 . \end{cases}

In the ‘Print File’, $c_{m a x}$ appears as the quantity labelled ‘Optimal’.
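The piecewise definition of $c_{j}$ can be sketched directly. The helper names are hypothetical, and $∥ π ∥$ is passed in as a precomputed number because the norm used is not stated in this section.

```python
def complementarity(d, x, l, u):
    """c_j from reduced gradients d, point x and bounds l <= x <= u."""
    c = []
    for dj, xj, lj, uj in zip(d, x, l, u):
        if dj >= 0.0:
            c.append(dj * min(xj - lj, 1.0))
        else:
            c.append(-dj * min(uj - xj, 1.0))
    return c


def scaled_max_complementarity(d, x, l, u, pi_norm):
    """c_max = max_j c_j / ||pi||, with ||pi|| supplied by the caller."""
    return max(complementarity(d, x, l, u)) / pi_norm
```

A variable sitting on the bound indicated by the sign of $d_{j}$ contributes $c_{j} = 0$, so only variables away from that bound with a nonzero reduced gradient prevent convergence.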

‘Major Iterations Limit’int

Default $= m a x (1000, 3 m a x (n, n_{L} + n_{N}))$

This is the maximum number of major iterations allowed. It is intended to guard against an excessive number of linearizations of the constraints. If $i = 0$ , optimality and feasibility are checked.

‘Major Print Level’int

Default $= 000001$

This controls the amount of output to the options ‘Print File’ and ‘Summary File’ at each major iteration. $‘Major Print Level' = 0$ suppresses most output, except for error messages. $‘Major Print Level' = 1$ gives normal output for linear and nonlinear problems, and $‘Major Print Level' = 11$ gives additional details of the Jacobian factorization that commences each major iteration.

In general, the value being specified may be thought of as a binary number of the form

‘Major Print Level’ $J F D X b s$

where each letter stands for a digit that is either $0$ or $1$ as follows:

$s$	a single line that gives a summary of each major iteration. (This entry in $J F D X b s$ is not strictly binary since the summary line is printed whenever $J F D X b s \geq 1$ );
$b$	basis statistics, i.e., information relating to the basis matrix whenever it is refactorized. (This output is always provided if $J F D X b s \geq 10$ );
$X$	$x_{k}$ , the nonlinear variables involved in the objective function or the constraints. These appear under the heading ‘Jacobian variables’;
$D$	$π_{k}$ , the dual variables for the nonlinear constraints. These appear under the heading ‘Multiplier estimates’;
$F$	$c (x_{k})$ , the values of the nonlinear constraint functions;
$J$	$J (x_{k})$ , the Jacobian matrix. This appears under the heading ‘ $x$ and Jacobian’.

To obtain output of any items $J F D X b s$ , set the corresponding digit to $1$ , otherwise to $0$ .

If $J = 1$ , the Jacobian matrix will be output column-wise at the start of each major iteration. Column $j$ will be preceded by the value of the corresponding variable $x_{j}$ and a key to indicate whether the variable is basic, superbasic or nonbasic. (Hence if $J = 1$ , there is no reason to specify $X = 1$ unless the objective contains more nonlinear variables than the Jacobian.) A typical line of output is

3 1.250000e+01 BS 1 1.00000e+00 4 2.00000e+00

which would mean that $x_{3}$ is basic at value $12.5$ , and the third column of the Jacobian has elements of $1.0$ and $2.0$ in rows $1$ and $4$ .
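The digit encoding of ‘Major Print Level’ can be illustrated with a small decoder. This is a sketch only: `decode_major_print_level` is a hypothetical helper, it assumes every digit is $0$ or $1$, and it applies the two stated exceptions for $s$ and $b$.

```python
def decode_major_print_level(level):
    """Map a value of the form JFDXbs to the set of output items it enables."""
    digits = f"{level:06d}"
    names = ["J", "F", "D", "X", "b", "s"]
    flags = {name: digit == "1" for name, digit in zip(names, digits)}
    # The summary line is printed whenever the value is at least 1,
    # and basis statistics whenever it is at least 10.
    flags["s"] = level >= 1
    flags["b"] = level >= 10
    return flags
```

For instance, the value $11$ enables only the summary line and the basis statistics, while $100000$ adds the Jacobian output.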

‘Major Step Limit’float

Default $= 2.0$

This argument limits the change in $x$ during a linesearch. It applies to all nonlinear problems, once a ‘feasible solution’ or ‘feasible subproblem’ has been found. A linesearch determines a step $α$ over the range $0 < α \leq β$ , where $β$ is $1$ if there are nonlinear constraints, or is the step to the nearest upper or lower bound on $x$ if all the constraints are linear. Normally, the first step length tried is $α_{1} = m i n (1, β)$ .

In some cases, such as $f (x) = a e^{b x}$ or $f (x) = a x^{b}$ , even a moderate change in the components of $x$ can lead to floating-point overflow. The argument $r$ is, therefore, used to define a limit $\bar{β} = r (1 + ∥ x ∥) / ∥ p ∥$ (where $p$ is the search direction), and the first evaluation of $f (x)$ is at the potentially smaller step length $α_{1} = m i n (1, \bar{β}, β)$ .

Wherever possible, upper and lower bounds on $x$ should be used to prevent evaluation of nonlinear functions at meaningless points. The option ‘Major Step Limit’ provides an additional safeguard. The default value $r = 2.0$ should not affect progress on well behaved problems, but setting $r = 0.1$ or $0.01$ may be helpful when rapidly varying functions are present. A ‘good’ starting point may be required. An important application is to the class of nonlinear least squares problems.

In cases where several local optima exist, specifying a small value for $r$ may help locate an optimum near the starting point.
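The first trial step length can be computed as in the sketch below. The Euclidean norm is assumed for $∥ x ∥$ and $∥ p ∥$ (the section does not state which norm is used), and `first_step_length` is a hypothetical helper.

```python
import math


def first_step_length(x, p, beta, r=2.0):
    """alpha_1 = min(1, beta_bar, beta) with beta_bar = r * (1 + ||x||) / ||p||."""
    norm_x = math.sqrt(sum(xi * xi for xi in x))  # assumption: 2-norm
    norm_p = math.sqrt(sum(pi * pi for pi in p))  # assumption: 2-norm, p nonzero
    beta_bar = r * (1.0 + norm_x) / norm_p
    return min(1.0, beta_bar, beta)
```

With $x = (3, 4)$ and a long search direction $p = (60, 80)$, the limit is $2 (1 + 5) / 100 = 0.12$, so the first step is cut well below the unit step.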

‘Minimize’valueless

Default

The keywords ‘Minimize’ and ‘Maximize’ specify the required direction of optimization. It applies to both linear and nonlinear terms in the objective.

The keyword ‘Feasible Point’ means ‘Ignore the objective function, while finding a feasible point for the linear and nonlinear constraints’. It can be used to check that the nonlinear constraints are feasible without altering the call to nlp2_solve.

‘Maximize’valueless

The keywords ‘Minimize’ and ‘Maximize’ specify the required direction of optimization. It applies to both linear and nonlinear terms in the objective.

The keyword ‘Feasible Point’ means ‘Ignore the objective function, while finding a feasible point for the linear and nonlinear constraints’. It can be used to check that the nonlinear constraints are feasible without altering the call to nlp2_solve.

‘Feasible Point’valueless

The keywords ‘Minimize’ and ‘Maximize’ specify the required direction of optimization. It applies to both linear and nonlinear terms in the objective.

The keyword ‘Feasible Point’ means ‘Ignore the objective function, while finding a feasible point for the linear and nonlinear constraints’. It can be used to check that the nonlinear constraints are feasible without altering the call to nlp2_solve.

‘Minor Feasibility Tolerance’float

nlp2_solve tries to ensure that all variables eventually satisfy their upper and lower bounds to within this tolerance, $r$ . This includes slack variables. Hence, general linear constraints should also be satisfied to within $r$ .

Feasibility with respect to nonlinear constraints is judged by the option ‘Major Feasibility Tolerance’ (not by $r$ ).

If the bounds and linear constraints cannot be satisfied to within $r$ , the problem is declared infeasible. If the corresponding sum of infeasibilities is quite small, it may be appropriate to raise $r$ by a factor of $10$ or $100$ . Otherwise, some error in the data should be suspected.

Nonlinear functions will be evaluated only at points that satisfy the bounds and linear constraints. If there are regions where a function is undefined, every attempt should be made to eliminate these regions from the problem.

For example, if $f (x) = \sqrt{x_{1}} + log (x_{2})$ , it is essential to place lower bounds on both variables. If $r = 1.0 e - 6$ , the bounds $x_{1} \geq 10^{- 5}$ and $x_{2} \geq 10^{- 4}$ might be appropriate. (The log singularity is more serious. In general, keep $x$ as far away from singularities as possible.)

If $‘Scale Option' \geq 1$ , feasibility is defined in terms of the scaled problem (since it is then more likely to be meaningful).

In reality, nlp2_solve uses $r$ as a feasibility tolerance for satisfying the bounds on $x$ and $s$ in each QP subproblem. If the sum of infeasibilities cannot be reduced to zero, the QP subproblem is declared infeasible. nlp2_solve is then in elastic mode thereafter (with only the linearized nonlinear constraints defined to be elastic). See the description of the option ‘Elastic Weight’.

‘Feasibility Tolerance’float

Default $= \max \{10^{- 6}, \sqrt{ϵ}\}$

nlp2_solve tries to ensure that all variables eventually satisfy their upper and lower bounds to within this tolerance, $r$ . This includes slack variables. Hence, general linear constraints should also be satisfied to within $r$ .

Feasibility with respect to nonlinear constraints is judged by the option ‘Major Feasibility Tolerance’ (not by $r$ ).

If the bounds and linear constraints cannot be satisfied to within $r$ , the problem is declared infeasible. If the corresponding sum of infeasibilities is quite small, it may be appropriate to raise $r$ by a factor of $10$ or $100$ . Otherwise, some error in the data should be suspected.

Nonlinear functions will be evaluated only at points that satisfy the bounds and linear constraints. If there are regions where a function is undefined, every attempt should be made to eliminate these regions from the problem.

For example, if $f (x) = \sqrt{x_{1}} + log (x_{2})$ , it is essential to place lower bounds on both variables. If $r = 1.0 e - 6$ , the bounds $x_{1} \geq 10^{- 5}$ and $x_{2} \geq 10^{- 4}$ might be appropriate. (The log singularity is more serious. In general, keep $x$ as far away from singularities as possible.)

If $‘Scale Option' \geq 1$ , feasibility is defined in terms of the scaled problem (since it is then more likely to be meaningful).

In reality, nlp2_solve uses $r$ as a feasibility tolerance for satisfying the bounds on $x$ and $s$ in each QP subproblem. If the sum of infeasibilities cannot be reduced to zero, the QP subproblem is declared infeasible. nlp2_solve is then in elastic mode thereafter (with only the linearized nonlinear constraints defined to be elastic). See the description of the option ‘Elastic Weight’.

‘Minor Iterations Limit’int

Default $= 500$

If the number of minor iterations for the optimality phase of the QP subproblem exceeds $i$ , then all nonbasic QP variables that have not yet moved are frozen at their current values and the reduced QP is solved to optimality.

Note that more than $i$ minor iterations may be necessary to solve the reduced QP to optimality. These extra iterations are necessary to ensure that the terminated point gives a suitable direction for the linesearch.

In the major iteration log (see Minor Iteration Log) a ‘t’ at the end of a line indicates that the corresponding QP was artificially terminated using the limit $i$ .

Compare with the option ‘Iterations Limit’, which defines an independent absolute limit on the total number of minor iterations (summed over all QP subproblems).

‘Minor Print Level’int

Default $= 1$

This controls the amount of output to the ‘Print File’ and the ‘Summary File’ during solution of the QP subproblems. The value of $i$ has the following effect:

$i$	Output
$0$	No minor iteration output except error messages.
$\geq 1$	A single line of output at each minor iteration (controlled by options ‘Print Frequency’ and ‘Summary Frequency’).
$\geq 10$	Basis factorization statistics generated during the periodic refactorization of the basis (see the option ‘Factorization Frequency’). Statistics for the first factorization each major iteration are controlled by the option ‘Major Print Level’.

‘New Basis File’int

Default $= 0$

‘New Basis File’ and ‘Backup Basis File’ are sometimes referred to as basis maps. They contain the most compact representation of the state of each variable. They are intended for restarting the solution of a problem at a point that was reached by an earlier run. For nontrivial problems, it is advisable to save basis maps at the end of a run, in order to restart the run if necessary.

If $i_{1} > 0$ , a basis map will be saved on the file associated with unit $i_{1}$ every $i_{3}$ th iteration. The first record of the file will contain the word PROCEEDING if the run is still in progress. A basis map will also be saved at the end of a run, with some other word indicating the final solution status.

Use of $i_{2} > 0$ is intended as a safeguard against losing the results of a long run. Suppose that a ‘New Basis File’ is being saved every $100$ (‘Save Frequency’) iterations, and that nlp2_solve is about to save such a basis at iteration $2000$ . It is conceivable that the run may be interrupted during the next few milliseconds (in the middle of the save). In this case the Basis file will be corrupted and the run will have been essentially wasted.

To eliminate this risk, both a ‘New Basis File’ and a ‘Backup Basis File’ may be specified. The following would be suitable for the above example:

Backup Basis File 11
New Basis File 12

The current basis will then be saved every $100$ iterations, first on the file associated with unit $12$ and then immediately on the file associated with unit $11$ . If the run is interrupted at iteration $2000$ during the save on the file associated with unit $12$ , there will still be a usable basis on the file associated with unit $11$ (corresponding to iteration $1900$ ).

Note that a new basis will be saved in ‘New Basis File’ at the end of a run if it terminates normally, but it will not be saved in ‘Backup Basis File’. In the above example, if an optimum solution is found at iteration $2050$ (or if the iteration limit is $2050$ ), the final basis in the file associated with unit $12$ will correspond to iteration $2050$ , but the last basis saved in the file associated with unit $11$ will be the one for iteration $2000$ .

A full description of information recorded in ‘New Basis File’ and ‘Backup Basis File’ is given in Gill et al. (2005a).

‘Backup Basis File’int

Default $= 0$

‘New Basis File’ and ‘Backup Basis File’ are sometimes referred to as basis maps. They contain the most compact representation of the state of each variable. They are intended for restarting the solution of a problem at a point that was reached by an earlier run. For nontrivial problems, it is advisable to save basis maps at the end of a run, in order to restart the run if necessary.

If $i_{1} > 0$ , a basis map will be saved on the file associated with unit $i_{1}$ every $i_{3}$ th iteration. The first record of the file will contain the word PROCEEDING if the run is still in progress. A basis map will also be saved at the end of a run, with some other word indicating the final solution status.

Use of $i_{2} > 0$ is intended as a safeguard against losing the results of a long run. Suppose that a ‘New Basis File’ is being saved every $100$ (‘Save Frequency’) iterations, and that nlp2_solve is about to save such a basis at iteration $2000$ . It is conceivable that the run may be interrupted during the next few milliseconds (in the middle of the save). In this case the Basis file will be corrupted and the run will have been essentially wasted.

To eliminate this risk, both a ‘New Basis File’ and a ‘Backup Basis File’ may be specified. The following would be suitable for the above example:

Backup Basis File 11
New Basis File 12

The current basis will then be saved every $100$ iterations, first on the file associated with unit $12$ and then immediately on the file associated with unit $11$ . If the run is interrupted at iteration $2000$ during the save on the file associated with unit $12$ , there will still be a usable basis on the file associated with unit $11$ (corresponding to iteration $1900$ ).

Note that a new basis will be saved in ‘New Basis File’ at the end of a run if it terminates normally, but it will not be saved in ‘Backup Basis File’. In the above example, if an optimum solution is found at iteration $2050$ (or if the iteration limit is $2050$ ), the final basis in the file associated with unit $12$ will correspond to iteration $2050$ , but the last basis saved in the file associated with unit $11$ will be the one for iteration $2000$ .

A full description of information recorded in ‘New Basis File’ and ‘Backup Basis File’ is given in Gill et al. (2005a).

‘Save Frequency’int

Default $= 100$

‘New Basis File’ and ‘Backup Basis File’ are sometimes referred to as basis maps. They contain the most compact representation of the state of each variable. They are intended for restarting the solution of a problem at a point that was reached by an earlier run. For nontrivial problems, it is advisable to save basis maps at the end of a run, in order to restart the run if necessary.

If $i_{1} > 0$ , a basis map will be saved on the file associated with unit $i_{1}$ every $i_{3}$ th iteration. The first record of the file will contain the word PROCEEDING if the run is still in progress. A basis map will also be saved at the end of a run, with some other word indicating the final solution status.

Use of $i_{2} > 0$ is intended as a safeguard against losing the results of a long run. Suppose that a ‘New Basis File’ is being saved every $100$ (‘Save Frequency’) iterations, and that nlp2_solve is about to save such a basis at iteration $2000$ . It is conceivable that the run may be interrupted during the next few milliseconds (in the middle of the save). In this case the Basis file will be corrupted and the run will have been essentially wasted.

To eliminate this risk, both a ‘New Basis File’ and a ‘Backup Basis File’ may be specified. The following would be suitable for the above example:

Backup Basis File 11
New Basis File 12

The current basis will then be saved every $100$ iterations, first on the file associated with unit $12$ and then immediately on the file associated with unit $11$ . If the run is interrupted at iteration $2000$ during the save on the file associated with unit $12$ , there will still be a usable basis on the file associated with unit $11$ (corresponding to iteration $1900$ ).

Note that a new basis will be saved in ‘New Basis File’ at the end of a run if it terminates normally, but it will not be saved in ‘Backup Basis File’. In the above example, if an optimum solution is found at iteration $2050$ (or if the iteration limit is $2050$ ), the final basis in the file associated with unit $12$ will correspond to iteration $2050$ , but the last basis saved in the file associated with unit $11$ will be the one for iteration $2000$ .

A full description of information recorded in ‘New Basis File’ and ‘Backup Basis File’ is given in Gill et al. (2005a).

‘New Superbasics Limit’int

Default $= 99$

This option causes early termination of the QP subproblems if the number of free variables has increased significantly since the first feasible point. If the number of new superbasics is greater than $i$ , the nonbasic variables that have not yet moved are frozen and the resulting smaller QP is solved to optimality.

In the major iteration log (see Major Iteration Log), a t at the end of a line indicates that the QP was terminated early in this way.

‘Old Basis File’int

Default $= 0$

If $i > 0$ , the basis map information will be obtained from this file. The file will usually have been output previously as a ‘New Basis File’ or ‘Backup Basis File’. A full description of information recorded in ‘New Basis File’ and ‘Backup Basis File’ is given in Gill et al. (2005a).

The file will not be acceptable if the number of rows or columns in the problem has been altered.

‘Partial Price’int

Default $= 1$

This argument is recommended for large problems that have significantly more variables than constraints. It reduces the work required for each ‘pricing’ operation (where a nonbasic variable is selected to become superbasic). When $i = 1$ , all columns of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ are searched. Otherwise, $A$ and $I$ are partitioned to give $i$ roughly equal segments $A_{j}$ and $I_{j}$ , for $j = 1, 2, \dots, i$ . If the previous pricing search was successful on $A_{j - 1}$ and $I_{j - 1}$ , the next search begins on the segments $A_{j}$ and $I_{j}$ . (All subscripts here are modulo $i$ .) If a reduced gradient is found that is larger than some dynamic tolerance, the variable with the largest such reduced gradient (of appropriate sign) is selected to become superbasic. If nothing is found, the search continues on the next segments $A_{j + 1}$ and $I_{j + 1}$ , and so on.

For time-stage models having $t$ time periods, ‘Partial Price’ $t$ (or $t / 2$ or $t / 3$ ) may be appropriate.
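The segment-cycling idea behind ‘Partial Price’ can be sketched as follows. This is an illustration, not the solver's implementation: `partial_price` is a hypothetical helper, segments are numbered from 0, the columns of $A$ and $I$ are treated as one flat list of reduced gradients, and the ‘appropriate sign’ test is simplified to a comparison of magnitudes against a fixed tolerance.

```python
def partial_price(reduced_gradients, num_segments, start_segment, tol):
    """Return (column, segment) of the chosen column, or (None, None)."""
    n = len(reduced_gradients)
    # Boundaries of num_segments roughly equal, contiguous segments.
    bounds = [(k * n) // num_segments for k in range(num_segments + 1)]
    for offset in range(num_segments):
        seg = (start_segment + offset) % num_segments  # subscripts are modulo i
        lo, hi = bounds[seg], bounds[seg + 1]
        candidates = [j for j in range(lo, hi)
                      if abs(reduced_gradients[j]) > tol]
        if candidates:
            # Pick the largest reduced gradient found in this segment.
            best = max(candidates, key=lambda j: abs(reduced_gradients[j]))
            return best, seg
    return None, None


d = [0.0, 0.1, 0.0, 0.0, 0.5, 0.9, 0.0, 0.0]
print(partial_price(d, num_segments=4, start_segment=1, tol=0.05))  # (5, 2)
```

With `num_segments=1` every column is searched on each call, which corresponds to the default $i = 1$ .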

‘Pivot Tolerance’float

Default $= ϵ^{\frac{2}{3}}$

During the solution of QP subproblems, the pivot tolerance is used to prevent columns entering the basis if they would cause the basis to become almost singular.

When $x$ changes to $x + α p$ for some search direction $p$ , a ‘ratio test’ determines which component of $x$ reaches an upper or lower bound first. The corresponding element of $p$ is called the pivot element. Elements of $p$ are ignored (and, therefore, cannot be pivot elements) if they are smaller than the pivot tolerance $r$ .

It is common for two or more variables to reach a bound at essentially the same time. In such cases, the ‘Minor Feasibility Tolerance’ (say, $t$ ) provides some freedom to maximize the pivot element and thereby improve numerical stability. Excessively small values of $t$ should, therefore, not be specified. To a lesser extent, the ‘Expand Frequency’ (say, $f$ ) also provides some freedom to maximize the pivot element. Excessively large values of $f$ should, therefore, not be specified.
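A minimal sketch of the ratio test described above, assuming simple bounds only and ignoring the feasibility-tolerance refinements mentioned in the last paragraph. The function name `ratio_test` and its arguments are hypothetical.

```python
def ratio_test(x, p, lower, upper, pivot_tol):
    """Return (alpha, j): the largest step and the index that hits a bound first."""
    best_alpha, best_j = float("inf"), None
    for j, (xj, pj) in enumerate(zip(x, p)):
        if abs(pj) < pivot_tol:
            continue  # too small to be a pivot element
        bound = upper[j] if pj > 0 else lower[j]
        alpha = (bound - xj) / pj  # step at which x_j reaches its bound
        if alpha < best_alpha:
            best_alpha, best_j = alpha, j
    return best_alpha, best_j


x = [0.0, 1.0, 2.0]
p = [1.0, -0.5, 1.0e-12]
alpha, j = ratio_test(x, p, lower=[0.0, 0.0, 0.0], upper=[4.0, 4.0, 2.0000001],
                      pivot_tol=1.0e-8)
print(alpha, j)  # 2.0 1
```

Here the third component would reach its bound almost immediately, but its element of $p$ is below the pivot tolerance, so it is skipped and the pivot element is `p[1]`.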

‘Print File’int

Default $= 0$

If $i > 0$ , the following information is output to a file associated with unit $i$ during the solution of each problem:

a listing of the options;
some statistics about the problem;
the amount of storage available for the $L U$ factorization of the basis matrix;
notes about the initial basis resulting from a Crash procedure or a Basis file;
the iteration log;
basis factorization statistics;
the exit $errno$ condition and some statistics about the solution obtained;
the printed solution, if requested.

These items are described in Further Comments and Monitoring Information. Further brief output may be directed to the ‘Summary File’.

‘Print Frequency’int

Default $= 100$

If $i > 0$ , one line of the iteration log will be printed every $i$ th iteration. A value such as $i = 10$ is suggested for those interested only in the final solution. If $i \leq 0$ , the value $i = 99999999$ is used and effectively no iteration log lines are printed.

‘Proximal Point Method’int

Default $= 1$

$i = 1$ or $2$ specifies minimization of ${∥ x - x_{0} ∥}_{1}$ or $\frac{1}{2} {∥ x - x_{0} ∥}_{2}^{2}$ , respectively, when the starting point $x_{0}$ is changed to satisfy the linear constraints (where $x_{0}$ refers to nonlinear variables).
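The two distance measures can be compared on a small example. `proximal_objective` is a hypothetical helper that evaluates the 1-norm (method 1) or half the squared 2-norm (method 2) of $x - x_{0}$ ; it does not call the library.

```python
def proximal_objective(x, x0, method):
    """Distance from x to x0 as measured by Proximal Point Method 1 or 2."""
    diffs = [a - b for a, b in zip(x, x0)]
    if method == 1:
        return sum(abs(d) for d in diffs)      # ||x - x0||_1
    if method == 2:
        return 0.5 * sum(d * d for d in diffs)  # (1/2) ||x - x0||_2^2
    raise ValueError("method must be 1 or 2")


print(proximal_objective([3.0, -4.0], [0.0, 0.0], method=1))  # 7.0
print(proximal_objective([3.0, -4.0], [0.0, 0.0], method=2))  # 12.5
```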

‘Punch File’int

Default $= 0$

The ‘Punch File’ from a previous run may be used as an ‘Insert File’ for a later run on the same problem. A full description of information recorded in ‘Insert File’ and ‘Punch File’ is given in Gill et al. (2005a).

If $i_{1} > 0$ , the final solution obtained will be output to the file. For linear programs, this format is compatible with various commercial systems.

If $i_{2} > 0$ the ‘Insert File’ containing basis information will be read from unit $i_{2}$ . The file will usually have been output previously as a ‘Punch File’. The file will not be accessed if ‘Old Basis File’ is specified.

‘Insert File’int

Default $= 0$

The ‘Punch File’ from a previous run may be used as an ‘Insert File’ for a later run on the same problem. A full description of information recorded in ‘Insert File’ and ‘Punch File’ is given in Gill et al. (2005a).

If $i_{1} > 0$ , the final solution obtained will be output to the file. For linear programs, this format is compatible with various commercial systems.

If $i_{2} > 0$ the ‘Insert File’ containing basis information will be read from unit $i_{2}$ . The file will usually have been output previously as a ‘Punch File’. The file will not be accessed if ‘Old Basis File’ is specified.

‘QPSolver Cholesky’valueless

Default

Specifies the active-set algorithm used to solve subproblem (1) (see Treatment of Constraint Infeasibilities). ‘QPSolver Cholesky’ holds the full Cholesky factor $R$ of the reduced Hessian $Z^{T} H Z$ . As the QP iterations proceed, the dimension of $R$ changes with the number of superbasic variables. If the number of superbasic variables needs to increase beyond the value of ‘Reduced Hessian Dimension’, the reduced Hessian cannot be stored and the solver switches to ‘QPSolver CG’. The Cholesky solver is reactivated if the number of superbasics stabilizes at a value less than ‘Reduced Hessian Dimension’.

‘QPSolver QN’ solves the QP using a quasi-Newton method. In this case, $R$ is the factor of a quasi-Newton approximate Hessian.

‘QPSolver CG’ uses an active-set method similar to ‘QPSolver QN’, but uses the conjugate-gradient method to solve all systems involving the reduced Hessian.

The Cholesky QP solver is the most robust, but may require a significant amount of computation if there are many superbasics.

The quasi-Newton QP solver does not require computation of the exact $R$ at the start of the subproblem in (1). It may be appropriate when the number of superbasics is large but relatively few iterations are needed to reach a solution (e.g., if nlp2_solve is called with a Warm Start).

The conjugate-gradient QP solver is appropriate for problems with many degrees of freedom (say, more than $2000$ superbasics).

‘QPSolver CG’valueless

Specifies the active-set algorithm used to solve subproblem (1) (see Treatment of Constraint Infeasibilities). ‘QPSolver Cholesky’ holds the full Cholesky factor $R$ of the reduced Hessian $Z^{T} H Z$ . As the QP iterations proceed, the dimension of $R$ changes with the number of superbasic variables. If the number of superbasic variables needs to increase beyond the value of ‘Reduced Hessian Dimension’, the reduced Hessian cannot be stored and the solver switches to ‘QPSolver CG’. The Cholesky solver is reactivated if the number of superbasics stabilizes at a value less than ‘Reduced Hessian Dimension’.

‘QPSolver QN’ solves the QP using a quasi-Newton method. In this case, $R$ is the factor of a quasi-Newton approximate Hessian.

‘QPSolver CG’ uses an active-set method similar to ‘QPSolver QN’, but uses the conjugate-gradient method to solve all systems involving the reduced Hessian.

The Cholesky QP solver is the most robust, but may require a significant amount of computation if there are many superbasics.

The quasi-Newton QP solver does not require computation of the exact $R$ at the start of the subproblem in (1). It may be appropriate when the number of superbasics is large but relatively few iterations are needed to reach a solution (e.g., if nlp2_solve is called with a Warm Start).

The conjugate-gradient QP solver is appropriate for problems with many degrees of freedom (say, more than $2000$ superbasics).

‘QPSolver QN’valueless

Specifies the active-set algorithm used to solve subproblem (1) (see Treatment of Constraint Infeasibilities). ‘QPSolver Cholesky’ holds the full Cholesky factor $R$ of the reduced Hessian $Z^{T} H Z$ . As the QP iterations proceed, the dimension of $R$ changes with the number of superbasic variables. If the number of superbasic variables needs to increase beyond the value of ‘Reduced Hessian Dimension’, the reduced Hessian cannot be stored and the solver switches to ‘QPSolver CG’. The Cholesky solver is reactivated if the number of superbasics stabilizes at a value less than ‘Reduced Hessian Dimension’.

‘QPSolver QN’ solves the QP using a quasi-Newton method. In this case, $R$ is the factor of a quasi-Newton approximate Hessian.

‘QPSolver CG’ uses an active-set method similar to ‘QPSolver QN’, but uses the conjugate-gradient method to solve all systems involving the reduced Hessian.

The Cholesky QP solver is the most robust, but may require a significant amount of computation if there are many superbasics.

The quasi-Newton QP solver does not require computation of the exact $R$ at the start of the subproblem in (1). It may be appropriate when the number of superbasics is large but relatively few iterations are needed to reach a solution (e.g., if nlp2_solve is called with a Warm Start).

The conjugate-gradient QP solver is appropriate for problems with many degrees of freedom (say, more than $2000$ superbasics).

‘Reduced Hessian Dimension’int

Default $= \min (2000, n)$

This specifies that an $i \times i$ triangular matrix $R$ (to define the reduced Hessian according to $R^{T} R = Z^{T} H Z$ ) is to be available for use by the Cholesky QP solver.
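To make the relation $R^{T} R = Z^{T} H Z$ concrete, the sketch below forms a reduced Hessian for a tiny dense example and computes its upper-triangular Cholesky factor in plain Python. The matrices and helper names are invented for illustration; the solver's own factorization and updating scheme are not shown here.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def cholesky_upper(m):
    """Upper-triangular R with R^T R = m (m symmetric positive definite)."""
    n = len(m)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = m[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = math.sqrt(s) if i == j else s / r[i][i]
    return r


H = [[4.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 2.0]]
Z = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # two superbasic directions
reduced = matmul(transpose(Z), matmul(H, Z))  # [[4.0, 1.0], [1.0, 3.0]]
R = cholesky_upper(reduced)
```

The dimension of `R` equals the number of columns of `Z`, which is why the storage limit is expressed in terms of the number of superbasic variables.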

‘Scale Option’int

Default $= 0$

Three scale options are available as follows:

$i$	Meaning
0	No scaling. This is recommended if it is known that $x$ and the constraint matrix never have very large elements (say, larger than $100$ ).
1	The constraints and variables are scaled by an iterative procedure that attempts to make the matrix coefficients as close as possible to $1.0$ (see Fourer (1982)). This will sometimes improve the performance of the solution procedures.
2	The constraints and variables are scaled by the iterative procedure. Also, a certain additional scaling is performed that may be helpful if the right-hand side $b$ or the solution $x$ is large. This takes into account columns of $(\begin{matrix} A & - I \end{matrix})$ that are fixed or have positive lower bounds or negative upper bounds.

Option ‘Scale Tolerance’ affects how many passes might be needed through the constraint matrix. On each pass, the scaling procedure computes the ratio of the largest and smallest nonzero coefficients in each column:

ρ_{j} = \max_{i} | a_{i j} | / \min_{i} | a_{i j} | \quad (a_{i j} \neq 0) .

If $\max_{j} ρ_{j}$ is less than $r$ times its previous value, another scaling pass is performed to adjust the row and column scales. Raising $r$ from $0.9$ to $0.99$ (say) usually increases the number of scaling passes through $A$ . At most $10$ passes are made. The value of $r$ should lie in the range $0 < r < 1$ .

‘Scale Print’ causes the row scales $r (i)$ and column scales $c (j)$ to be printed to ‘Print File’, if ‘System Information Yes’ has been specified. The scaled matrix coefficients are $\bar{a}_{i j} = a_{i j} c (j) / r (i)$ , and the scaled bounds on the variables and slacks are $\bar{l}_{j} = l_{j} / c (j)$ , $\bar{u}_{j} = u_{j} / c (j)$ , where $c (j) = r (j - n)$ if $j > n$ .
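The column ratios and the scaled coefficients can be computed directly for a small dense matrix. The helpers below are hypothetical and only mirror the formulas in the text; the iterative procedure that chooses the scales is not reproduced, and giving an all-zero column a ratio of 1 is an assumption of this sketch.

```python
def column_ratios(a):
    """Ratio of largest to smallest nonzero magnitude in each column of a."""
    ratios = []
    for column in zip(*a):
        nonzero = [abs(v) for v in column if v != 0.0]
        # Assumption of this sketch: an all-zero column is given ratio 1.
        ratios.append(max(nonzero) / min(nonzero) if nonzero else 1.0)
    return ratios


def scale_matrix(a, row_scales, col_scales):
    """Scaled coefficients a_ij * c(j) / r(i)."""
    return [[a[i][j] * col_scales[j] / row_scales[i]
             for j in range(len(a[0]))] for i in range(len(a))]


a = [[1.0, 0.0], [100.0, 2.0], [0.0, 8.0]]
ratios = column_ratios(a)
print(ratios, max(ratios))  # [100.0, 4.0] 100.0
```

A scaling pass is worthwhile only while `max(ratios)` keeps dropping below $r$ times its previous value.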

‘Scale Tolerance’float

Default $= 0.9$

Three scale options are available as follows:

$i$	Meaning
0	No scaling. This is recommended if it is known that $x$ and the constraint matrix never have very large elements (say, larger than $100$ ).
1	The constraints and variables are scaled by an iterative procedure that attempts to make the matrix coefficients as close as possible to $1.0$ (see Fourer (1982)). This will sometimes improve the performance of the solution procedures.
2	The constraints and variables are scaled by the iterative procedure. Also, a certain additional scaling is performed that may be helpful if the right-hand side $b$ or the solution $x$ is large. This takes into account columns of $(\begin{matrix} A & - I \end{matrix})$ that are fixed or have positive lower bounds or negative upper bounds.

Option ‘Scale Tolerance’ affects how many passes might be needed through the constraint matrix. On each pass, the scaling procedure computes the ratio of the largest and smallest nonzero coefficients in each column:

ρ_{j} = \max_{i} | a_{i j} | / \min_{i} | a_{i j} | \quad (a_{i j} \neq 0) .

If $\max_{j} ρ_{j}$ is less than $r$ times its previous value, another scaling pass is performed to adjust the row and column scales. Raising $r$ from $0.9$ to $0.99$ (say) usually increases the number of scaling passes through $A$ . At most $10$ passes are made. The value of $r$ should lie in the range $0 < r < 1$ .

‘Scale Print’ causes the row scales $r (i)$ and column scales $c (j)$ to be printed to ‘Print File’, if ‘System Information Yes’ has been specified. The scaled matrix coefficients are $\bar{a}_{i j} = a_{i j} c (j) / r (i)$ , and the scaled bounds on the variables and slacks are $\bar{l}_{j} = l_{j} / c (j)$ , $\bar{u}_{j} = u_{j} / c (j)$ , where $c (j) = r (j - n)$ if $j > n$ .

‘Scale Print’valueless

Three scale options are available as follows:

$i$	Meaning
0	No scaling. This is recommended if it is known that $x$ and the constraint matrix never have very large elements (say, larger than $100$ ).
1	The constraints and variables are scaled by an iterative procedure that attempts to make the matrix coefficients as close as possible to $1.0$ (see Fourer (1982)). This will sometimes improve the performance of the solution procedures.
2	The constraints and variables are scaled by the iterative procedure. Also, a certain additional scaling is performed that may be helpful if the right-hand side $b$ or the solution $x$ is large. This takes into account columns of $(\begin{matrix} A & - I \end{matrix})$ that are fixed or have positive lower bounds or negative upper bounds.

Option ‘Scale Tolerance’ affects how many passes might be needed through the constraint matrix. On each pass, the scaling procedure computes the ratio of the largest and smallest nonzero coefficients in each column:

ρ_{j} = \max_{i} | a_{i j} | / \min_{i} | a_{i j} | \quad (a_{i j} \neq 0) .

If $\max_{j} ρ_{j}$ is less than $r$ times its previous value, another scaling pass is performed to adjust the row and column scales. Raising $r$ from $0.9$ to $0.99$ (say) usually increases the number of scaling passes through $A$ . At most $10$ passes are made. The value of $r$ should lie in the range $0 < r < 1$ .

‘Scale Print’ causes the row scales $r (i)$ and column scales $c (j)$ to be printed to ‘Print File’, if ‘System Information Yes’ has been specified. The scaled matrix coefficients are $\bar{a}_{i j} = a_{i j} c (j) / r (i)$ , and the scaled bounds on the variables and slacks are $\bar{l}_{j} = l_{j} / c (j)$ , $\bar{u}_{j} = u_{j} / c (j)$ , where $c (j) = r (j - n)$ if $j > n$ .

‘Solution File’int

Default $= 0$

If $i > 0$ , the final solution will be output to file $i$ (whether optimal or not). All numbers are printed in 1pe16.6 format.

To see more significant digits in the printed solution, it will sometimes be useful to make $i$ refer to ‘Print File’.

‘Start Objective Check At Variable’int

Default $= 1$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by function $objfun$ and/or Jacobian elements computed by function $confun$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, it is reasonable to specify $‘Start Objective Check At Variable' = 31$ . If the first $30$ variables appear linearly in the objective, so that the corresponding gradient elements are constant, the above choice would also be appropriate.

‘Stop Objective Check At Variable’int

Default $= n$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by function $objfun$ and/or Jacobian elements computed by function $confun$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, it is reasonable to specify $‘Start Objective Check At Variable' = 31$ . If the first $30$ variables appear linearly in the objective, so that the corresponding gradient elements are constant, the above choice would also be appropriate.

‘Start Constraint Check At Variable’int

Default $= 1$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by function $objfun$ and/or Jacobian elements computed by function $confun$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, it is reasonable to specify $‘Start Objective Check At Variable' = 31$ . If the first $30$ variables appear linearly in the objective, so that the corresponding gradient elements are constant, the above choice would also be appropriate.

‘Stop Constraint Check At Variable’int

Default $= n$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by function $objfun$ and/or Jacobian elements computed by function $confun$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, it is reasonable to specify $‘Start Objective Check At Variable' = 31$ . If the first $30$ variables appear linearly in the objective, so that the corresponding gradient elements are constant, the above choice would also be appropriate.
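The effect of restricting a derivative check to a range of variables can be sketched with a simple forward-difference test. This is not the verification performed by the library: `check_gradient`, the step `h` and the tolerance `tol` are assumptions chosen for illustration, and the `start`/`stop` arguments merely play the role of the ‘Start Objective Check At Variable’ and ‘Stop Objective Check At Variable’ options (1-based, inclusive).

```python
def check_gradient(f, grad, x, start=1, stop=None, h=1.0e-6, tol=1.0e-4):
    """Compare gradient elements start..stop with forward differences."""
    stop = len(x) if stop is None else stop
    g = grad(x)
    f0 = f(x)
    report = {}
    for j in range(start, stop + 1):
        xp = list(x)
        xp[j - 1] += h
        fd = (f(xp) - f0) / h  # forward-difference estimate of element j
        ok = abs(fd - g[j - 1]) <= tol * (1.0 + abs(fd))
        report[j] = "OK" if ok else "Bad?"
    return report


f = lambda x: x[0] ** 2 + 3.0 * x[1] + x[2] ** 3
# Deliberately wrong third element (should be 3 * x[2] ** 2).
bad_grad = lambda x: [2.0 * x[0], 3.0, 2.0 * x[2] ** 2]
report = check_gradient(f, bad_grad, [1.0, 2.0, 3.0], start=2, stop=3)
print(report)  # {2: 'OK', 3: 'Bad?'}
```

Element 1 is never evaluated, so no work is spent on a component already believed to be correct.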

‘Summary File’int

Default $= 0$

If $i_{1} > 0$ , a brief log will be output to the file associated with unit $i_{1}$ , including one line of information every $i_{2}$ th iteration. In an interactive environment, it is useful to direct this output to the terminal, to allow a run to be monitored online. (If something looks wrong, the run can be manually terminated.) Further details are given in The Summary File.

‘Summary Frequency’int

Default $= 100$

If $i_{1} > 0$ , a brief log will be output to the file associated with unit $i_{1}$ , including one line of information every $i_{2}$ th iteration. In an interactive environment, it is useful to direct this output to the terminal, to allow a run to be monitored online. (If something looks wrong, the run can be manually terminated.) Further details are given in The Summary File.

‘Superbasics Limit’int

Default $= n$

This option places a limit on the storage allocated for superbasic variables. Ideally, $i$ should be set slightly larger than the ‘number of degrees of freedom’ expected at an optimal solution.

For nonlinear problems, the number of degrees of freedom is often called the ‘number of independent variables’. Normally, $i$ need not be greater than $n_{N} + 1$ , where $n_{N}$ is the number of nonlinear variables. For many problems, $i$ may be considerably smaller than $n_{N}$ . This will save storage if $n_{N}$ is very large.

‘Suppress Parameters’valueless

Normally nlp2_solve prints the options file as it is being read, and then prints a complete list of the available keywords and their final values. The option ‘Suppress Parameters’ tells nlp2_solve not to print the full list.

‘System Information No’valueless

Default

This option prints additional information on the progress of major and minor iterations, and Crash statistics. See Monitoring Information.

‘System Information Yes’valueless

This option prints additional information on the progress of major and minor iterations, and Crash statistics. See Monitoring Information.

‘Timing Level’int

Default $= 0$

If $i > 0$ , some timing information will be output to the Print file, if $‘Print File' > 0$ .

‘Unbounded Objective’float

Default $= 1.0e+15$

These arguments are intended to detect unboundedness in nonlinear problems. During a linesearch, $F$ is evaluated at points of the form $x + α p$ , where $x$ and $p$ are fixed and $α$ varies. If $| F |$ exceeds $r_{1}$ or $α$ exceeds $r_{2}$ , iterations are terminated with the exit message $errno$ = 5.

If singularities are present, unboundedness in $F (x)$ may be manifested by a floating-point overflow (during the evaluation of $F (x + α p)$ ), before the test against $r_{1}$ can be made.

Unboundedness in $x$ is best avoided by placing finite upper and lower bounds on the variables.
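The two tests can be sketched as below. `linesearch_looks_unbounded` is a hypothetical helper, the trial steps are supplied by the caller rather than generated by a real linesearch, and the value `1.0e20` used for the step limit is only a stand-in for $bigbnd$ .

```python
def linesearch_looks_unbounded(F, x, p, alphas,
                               unbounded_objective=1.0e15,
                               unbounded_step_size=1.0e20):
    """Apply the |F| > r1 and alpha > r2 tests along x + alpha * p."""
    for alpha in alphas:
        if alpha > unbounded_step_size:
            return True
        trial = [xi + alpha * pi for xi, pi in zip(x, p)]
        if abs(F(trial)) > unbounded_objective:
            return True
    return False


F = lambda v: -(v[0] ** 3)  # unbounded below as v[0] grows
print(linesearch_looks_unbounded(F, [0.0], [1.0], [1.0, 1.0e3, 1.0e6]))  # True
print(linesearch_looks_unbounded(F, [0.0], [1.0], [1.0, 10.0]))          # False
```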

‘Unbounded Step Size’float

Default $= bigbnd$

These arguments are intended to detect unboundedness in nonlinear problems. During a linesearch, $F$ is evaluated at points of the form $x + α p$ , where $x$ and $p$ are fixed and $α$ varies. If $| F |$ exceeds $r_{1}$ or $α$ exceeds $r_{2}$ , iterations are terminated with the exit message $errno$ = 5.

If singularities are present, unboundedness in $F (x)$ may be manifested by a floating-point overflow (during the evaluation of $F (x + α p)$ ), before the test against $r_{1}$ can be made.

Unboundedness in $x$ is best avoided by placing finite upper and lower bounds on the variables.

‘Verify Level’int

Default $= 0$

This option refers to finite difference checks on the derivatives computed by the user-supplied functions. Derivatives are checked at the first point that satisfies all bounds and linear constraints.

$i$	Meaning
$0$	Only a ‘cheap’ test will be performed, requiring two calls to $confun$ .
$1$	Individual gradients will be checked (with a more reliable test). A key of the form OK or Bad? indicates whether or not each component appears to be correct.
$2$	Individual columns of the problem Jacobian will be checked.
$3$	Options 2 and 1 will both occur (in that order).
$- 1$	Derivative checking is disabled.

$‘Verify Level' = 3$ should be specified whenever a new user-supplied function is being developed. Missing derivatives are not checked, so they result in no overhead.

‘Violation Limit’float

Default $= 1.0e+6$

This keyword defines an absolute limit on the magnitude of the maximum constraint violation, $r$ , after the linesearch. On completion of the linesearch, the new iterate $x_{k + 1}$ satisfies the condition

v_{i} (x_{k + 1}) \leq r \max (1, v_{i} (x_{0})),

where $x_{0}$ is the point at which the nonlinear constraints are first evaluated and $v_{i} (x)$ is the $i$ th nonlinear constraint violation $v_{i} (x) = \max (0, l_{i} - c_{i} (x), c_{i} (x) - u_{i})$ .

The effect of this violation limit is to restrict the iterates to lie in an expanded feasible region whose size depends on the magnitude of $r$ . This makes it possible to keep the iterates within a region where the objective is expected to be well defined and bounded below. If the objective is bounded below for all values of the variables, $r$ may be any large positive value.
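The violation measure and the acceptance condition translate directly into code. The helper names below are hypothetical; the functions take already-evaluated constraint values $c (x)$ rather than calling a user-supplied $confun$ .

```python
def violations(c, l, u):
    """v_i = max(0, l_i - c_i, c_i - u_i) for each nonlinear constraint."""
    return [max(0.0, li - ci, ci - ui) for ci, li, ui in zip(c, l, u)]


def within_violation_limit(c_new, c_start, l, u, r=1.0e6):
    """Check v_i(x_new) <= r * max(1, v_i(x_start)) for every i."""
    v_new = violations(c_new, l, u)
    v_start = violations(c_start, l, u)
    return all(vn <= r * max(1.0, v0) for vn, v0 in zip(v_new, v_start))


print(violations([0.5, 3.0, -2.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
# [0.0, 2.0, 2.0]
```

With the default $r$ the condition is rarely restrictive; a small value such as `r=2.0` keeps the iterates close to the feasible region.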

Raises

NagValueError

(errno $1$ )

The initialization function nlp2_init() has not been called.

(errno $2$ )

On entry, bounds $bl$ and $bu$ for $⟨value⟩$ are equal and infinite. $bl = bu = ⟨value⟩$ and $bigbnd = ⟨value⟩$ .

(errno $2$ )

On entry, bounds for $⟨value⟩$ are inconsistent. $bl = ⟨value⟩$ and $bu = ⟨value⟩$ .

(errno $2$ )

On entry, bounds $bl$ and $bu$ for $⟨value⟩$ $⟨value⟩$ are equal and infinite. $bl = bu = ⟨value⟩$ and $bigbnd = ⟨value⟩$ .

(errno $2$ )

On entry, bounds for $⟨value⟩$ $⟨value⟩$ are inconsistent. $bl = ⟨value⟩$ and $bu = ⟨value⟩$ .

(errno $2$ )

Basis file dimensions do not match this problem.

(errno $2$ )

On entry, $n = ⟨value⟩$ .

Constraint: $n \geq ⟨value⟩$ .

(errno $2$ )

On entry, $nclin = ⟨value⟩$ .

Constraint: $nclin \geq ⟨value⟩$ .

(errno $2$ )

On entry, $ncnln = ⟨value⟩$ .

Constraint: $ncnln \geq ⟨value⟩$ .

(errno $8$ )

User-supplied function computes incorrect objective derivatives.

(errno $8$ )

User-supplied function computes incorrect constraint derivatives.

(errno $11$ )

Internal error: memory allocation failed when attempting to allocate workspace sizes $⟨value⟩$ and $⟨value⟩$ . Please contact NAG.

(errno $12$ )

Internal memory allocation was insufficient. Please contact NAG.

(errno $13$ )

An error has occurred in the basis package, perhaps indicating incorrect setup of arrays. Set the option ‘Print File’ and examine the output carefully for further information.

(errno $14$ )

An unexpected error has occurred. Set the option ‘Print File’ and examine the output carefully for further information.

Warns

NagAlgorithmicWarning

(errno $3$ ): The requested accuracy could not be achieved.
(errno $4$ ): The linear constraints appear to be infeasible.
(errno $4$ ): The problem appears to be infeasible. The linear equality constraints could not be satisfied.
(errno $4$ ): The problem appears to be infeasible. Nonlinear infeasibilities have been minimized.
(errno $4$ ): The problem appears to be infeasible. Infeasibilities have been minimized.
(errno $5$ ): The problem appears to be unbounded. The objective function is unbounded.
(errno $5$ ): The problem appears to be unbounded. The constraint violation limit has been reached.
(errno $7$ ): Numerical difficulties have been encountered and no further progress can be made.
(errno $10$ ): User-supplied constraint function requested termination.
(errno $10$ ): User-supplied objective function requested termination.

NagAlgorithmicMajorWarning

(errno $6$ ): Iteration limit reached.
(errno $6$ ): Major iteration limit reached.
(errno $6$ ): The value of the option ‘Superbasics Limit’ is too small.
(errno $9$ ): User-supplied function is undefined at the first feasible point.
(errno $9$ ): User-supplied function is undefined at the initial point.
(errno $9$ ): Unable to proceed into undefined region of user-supplied function.
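When post-processing results it can help to group the errno values by the class under which they are listed above. The lookup below is a hypothetical convenience built only from the lists on this page; it does not import or call the library, and the class names are used purely as strings.

```python
ERRNO_CLASS = {}
for class_name, codes in (
    ("NagValueError", (1, 2, 8, 11, 12, 13, 14)),
    ("NagAlgorithmicWarning", (3, 4, 5, 7, 10)),
    ("NagAlgorithmicMajorWarning", (6, 9)),
):
    for code in codes:
        ERRNO_CLASS[code] = class_name


def is_error(errno):
    """True if the errno is listed under Raises rather than Warns."""
    return ERRNO_CLASS[errno] == "NagValueError"


print(ERRNO_CLASS[4], is_error(4))  # NagAlgorithmicWarning False
print(ERRNO_CLASS[8], is_error(8))  # NagValueError True
```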

Notes

nlp2_solve is designed to solve nonlinear programming problems – the minimization of a smooth nonlinear function subject to a set of constraints on the variables. nlp2_solve is suitable for small dense problems. The problem is assumed to be stated in the following form:

\underset{x \in R^{n}}{\mathrm{minimize}} \; F (x) \quad \text{subject to} \quad l \leq \begin{pmatrix} x \\ A_{L} x \\ c (x) \end{pmatrix} \leq u, \qquad (1)

where $F (x)$ (the objective function) is a nonlinear scalar function, $A_{L}$ is an $n_{L} \times n$ constant matrix, and $c (x)$ is an $n_{N}$ -vector of nonlinear constraint functions. (The matrix $A_{L}$ and the vector $c (x)$ may be empty.) The objective function and the constraint functions are assumed to be smooth, here meaning at least twice-continuously differentiable. (The method of nlp2_solve will usually solve (1) if there are only isolated discontinuities away from the solution.) We also write $r (x)$ for the vector of combined functions:

r (x) = {(\begin{matrix} x & A_{L} x & c (x) \end{matrix})}^{T} .
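As a minimal sketch of how the combined vector $r (x)$ is laid out, the following plain NumPy code (not a call into the library) stacks the variables, the linear constraint values and the nonlinear constraint values in that order. The helper names `combined_r` and `c`, and the sample data, are hypothetical and chosen only for illustration.

```python
import numpy as np

def combined_r(x, a_l, c):
    """Stack x, A_L x and c(x) into the combined vector r(x)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, a_l @ x, c(x)])

# Illustrative data (assumed): n = 4, one linear and two nonlinear constraints.
a_l = np.ones((1, 4))  # single linear constraint row: x1 + x2 + x3 + x4

def c(x):
    """Two illustrative nonlinear constraint functions."""
    return np.array([np.sum(x**2), np.prod(x)])

x = np.array([1.0, 5.0, 5.0, 1.0])
r = combined_r(x, a_l, c)  # length n + n_L + n_N = 4 + 1 + 2
```

The ordering matches the one used for the bound vectors, so element $i$ of `r` is compared against element $i$ of $l$ and $u$.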

Note that although the bounds on the variables could be included in the definition of the linear constraints, we prefer to distinguish between them for reasons of computational efficiency. For the same reason, the linear constraints should not be included in the definition of the nonlinear constraints. Upper and lower bounds are specified for all the variables and for all the constraints. An equality constraint on $r_{i}$ can be specified by setting $l_{i} = u_{i}$ . If certain bounds are not present, the associated elements of $l$ or $u$ can be set to special values that will be treated as $- \infty$ or $+ \infty$ . (See the description of the option ‘Infinite Bound Size’.)
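The bound conventions above can be sketched as follows. This is an illustration only: the problem sizes and bound values are made up, and the constant `BIG` assumes that $10^{20}$ is large enough to be treated as infinite, which should be checked against the current setting of ‘Infinite Bound Size’.

```python
import numpy as np

BIG = 1.0e20  # assumed magnitude treated as infinite; see 'Infinite Bound Size'

n, nclin, ncnln = 4, 1, 2  # illustrative problem sizes
bl = np.empty(n + nclin + ncnln)
bu = np.empty(n + nclin + ncnln)

# Simple bounds on the variables: 1 <= x_j <= 5.
bl[:n], bu[:n] = 1.0, 5.0
# Linear constraint with no lower bound: x1 + x2 + x3 + x4 <= 20.
bl[n], bu[n] = -BIG, 20.0
# First nonlinear constraint as an equality: l_i = u_i = 40.
bl[n + 1] = bu[n + 1] = 40.0
# Second nonlinear constraint with no upper bound: c_2(x) >= 25.
bl[n + 2], bu[n + 2] = 25.0, BIG
```

Variables come first, then the linear constraints, then the nonlinear constraints, in line with the shape $(n + nclin + ncnln)$ required for $b l$ and $b u$.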

The figure below illustrates the feasible region for the $j$ th pair of constraints $l_{j} \leq r_{j} (x) \leq u_{j}$ . The quantity $\delta$ is the ‘Feasibility Tolerance’, which can be set by you (see Other Parameters). The constraints $l_{j} \leq r_{j} \leq u_{j}$ are considered ‘satisfied’ if $r_{j}$ lies in Regions 2, 3 or 4, and ‘inactive’ if $r_{j}$ lies in Region 3. The constraint $r_{j} \geq l_{j}$ is considered ‘active’ in Region 2, and ‘violated’ in Region 1. Similarly, $r_{j} \leq u_{j}$ is active in Region 4, and violated in Region 5. For equality constraints ( $l_{j} = u_{j}$ ), Regions 2 and 4 are the same and Region 3 is empty.

[figure omitted]
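Since the figure is not reproduced here, the following sketch shows one plausible reading of the five regions. It assumes that each active region is a band of half-width $\delta$ centred on the corresponding bound; the exact boundary conventions are an assumption, and the function `region` is a hypothetical helper, not part of the library.

```python
def region(r_j, l_j, u_j, delta):
    """Return the region number (1 to 5) containing r_j.

    Assumed boundaries: Region 2 is [l_j - delta, l_j + delta] and
    Region 4 is [u_j - delta, u_j + delta].
    """
    if r_j < l_j - delta:
        return 1  # lower bound violated
    if r_j <= l_j + delta:
        return 2  # lower bound active (coincides with Region 4 if l_j == u_j)
    if r_j < u_j - delta:
        return 3  # both bounds inactive
    if r_j <= u_j + delta:
        return 4  # upper bound active
    return 5      # upper bound violated
```

For example, with bounds $1 \leq r_{j} \leq 5$ and `delta = 1e-6`, the value `3.0` falls in Region 3, while `5.0` falls in Region 4. For an equality constraint the function reports Region 2 for any value within `delta` of the common bound.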

If there are no nonlinear constraints in (1) and $F$ is linear or quadratic, then it will generally be more efficient to use one of lp_solve(), lsq_lincon_solve() or qp_dense_solve(). If the problem is large and sparse and does have nonlinear constraints, then handle_solve_ssqp() should be used, since nlp2_solve treats all matrices as dense.

You must supply an initial estimate of the solution to (1), together with functions that define $F (x)$ and $c (x)$ with as many first partial derivatives as possible; unspecified derivatives are approximated by finite differences; see Other Parameters for a discussion of the option ‘Derivative Level’.

The objective function is defined by $o b j f u n$ , and the nonlinear constraints are defined by $c o n f u n$ . Note that if there are any nonlinear constraints then the first call to $c o n f u n$ will precede the first call to $o b j f u n$ .
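A hedged sketch of an objective callback follows, using the signature `(mode, objf, grad) = objfun(mode, x, grad, nstate, data=None)` given in the Parameters section. The interpretation of `mode` used here (0: value only, 1: gradient only, 2: both, negative on return: request termination) is an assumption that should be checked against the $o b j f u n$ parameter description. The objective itself is illustrative only.

```python
import numpy as np

def objfun(mode, x, grad, nstate, data=None):
    """Illustrative objective F(x) = x1*x4*(x1 + x2 + x3) + x3."""
    if not np.all(np.isfinite(x)):
        # Assumed convention: a negative mode asks the solver to stop.
        return -1, 0.0, grad
    objf = 0.0
    if mode in (0, 2):
        objf = x[0] * x[3] * (x[0] + x[1] + x[2]) + x[2]
    if mode in (1, 2):
        # Analytic first partial derivatives of F.
        grad[:] = [
            x[3] * (2.0 * x[0] + x[1] + x[2]),
            x[0] * x[3],
            x[0] * x[3] + 1.0,
            x[0] * (x[0] + x[1] + x[2]),
        ]
    return mode, objf, grad

# Direct call, outside the solver, to inspect the returned values.
mode_out, f0, g0 = objfun(2, np.array([1.0, 5.0, 5.0, 1.0]), np.zeros(4), 0)
```

Calling the callback directly like this, before handing it to the solver, is a cheap way to confirm that it returns values of the expected shape.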

For maximum reliability, it is preferable for you to provide all partial derivatives (see Chapter 8 of Gill et al. (1981), for a detailed discussion). If all gradients cannot be provided, it is similarly advisable to provide as many as possible. While developing $o b j f u n$ and $c o n f u n$ , the option ‘Verify Level’ should be used to check the calculation of any known gradients.
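Independently of ‘Verify Level’, a hand-coded gradient can be compared against a finite-difference estimate before the solver is ever called. The sketch below uses central differences with a fixed step; it is not the procedure used by the library, and the names `check_gradient`, `f` and `grad_f`, as well as the step `h`, are hypothetical choices.

```python
import numpy as np

def check_gradient(f, grad_f, x, h=1.0e-6):
    """Return the largest absolute difference between grad_f(x) and a
    central-difference estimate of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    estimate = np.empty_like(x)
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        estimate[j] = (f(x + step) - f(x - step)) / (2.0 * h)
    return np.max(np.abs(estimate - grad_f(x)))

def f(x):
    """Illustrative objective."""
    return x[0] * x[3] * (x[0] + x[1] + x[2]) + x[2]

def grad_f(x):
    """Hand-coded gradient of the illustrative objective."""
    return np.array([
        x[3] * (2.0 * x[0] + x[1] + x[2]),
        x[0] * x[3],
        x[0] * x[3] + 1.0,
        x[0] * (x[0] + x[1] + x[2]),
    ])

err = check_gradient(f, grad_f, [1.0, 5.0, 5.0, 1.0])
```

A small discrepancy suggests the analytic gradient is consistent with the function values at that point; a large one usually points to a coding error in one of the partial derivatives.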

The method used by nlp2_solve is based on NPOPT, which is part of the SNOPT package described in Gill et al. (2005b), and the algorithm it uses is described in detail in Algorithmic Details.

References

Eldersveld, S K, 1991, Large-scale sequential quadratic programming algorithms, PhD Thesis, Department of Operations Research, Stanford University, Stanford

Fourer, R, 1982, Solving staircase linear programs by the simplex method, Math. Programming (23), 274–313

Gill, P E, Murray, W and Saunders, M A, 2002, SNOPT: An SQP Algorithm for Large-scale Constrained Optimization, SIAM J. Optim. (12), 979–1006

Gill, P E, Murray, W and Saunders, M A, 2005a, Users’ guide for SQOPT 7: a Fortran package for large-scale linear and quadratic programming, Report NA 05-1, Department of Mathematics, University of California, San Diego, https://www.ccom.ucsd.edu/~peg/papers/sqdoc7.pdf

Gill, P E, Murray, W and Saunders, M A, 2005b, Users’ guide for SNOPT 7.1: a Fortran package for large-scale nonlinear programming, Report NA 05-2, Department of Mathematics, University of California, San Diego, https://www.ccom.ucsd.edu/~peg/papers/sndoc7.pdf

Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1986, Users’ guide for NPSOL (Version 4.0): a Fortran package for nonlinear programming, Report SOL 86-2, Department of Operations Research, Stanford University

Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1992, Some theoretical properties of an augmented Lagrangian merit function, Advances in Optimization and Parallel Computing, (ed P M Pardalos), 101–128, North Holland

Gill, P E, Murray, W and Wright, M H, 1981, Practical Optimization, Academic Press

Hock, W and Schittkowski, K, 1981, Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems (187), Springer–Verlag
