naginterfaces.library.opt.nlp1_sparse_solve

naginterfaces.library.opt.nlp1_sparse_solve(m, ncnln, nonln, njnln, iobj, a, ha, ka, bl, bu, start, names, ns, xs, istate, clamda, comm, confun=None, objfun=None, leniz=None, lenz=500, data=None, io_manager=None)

nlp1_sparse_solve solves sparse nonlinear programming problems.

Note: this function uses optional algorithmic parameters, see also: nlp1_sparse_option_file(), nlp1_sparse_option_string(), nlp1_init().

Deprecated since version 28.3.0.0: nlp1_sparse_solve is deprecated. Please use handle_solve_ssqp() instead. See also the Replacement Calls document.

For full information please refer to the NAG Library document for e04ug

https://support.nag.com/numeric/nl/nagdoc_30.3/flhtml/e04/e04ugf.html

Parameters

m : int

$m$ , the number of general constraints (or slacks). This is the number of rows in $A$ , including the free row (if any; see $i o b j$ ). Note that $A$ must contain at least one row. If your problem has no constraints, or only upper and lower bounds on the variables, then you must include a dummy ‘free’ row consisting of a single (zero) element subject to ‘infinite’ upper and lower bounds. Further details can be found under the descriptions for $i o b j$ , $nnz$ , $a$ , $h a$ , $k a$ , $b l$ and $b u$ .

ncnln : int

$n_{N}$ , the number of nonlinear constraints.

nonln : int

$n_{1}^{'}$ , the number of nonlinear objective variables. If the objective function is nonlinear, the leading $n_{1}^{'}$ columns of $A$ belong to the nonlinear objective variables. (See also the description for $n j n l n$ .)

njnln : int

$n_{1}^{''}$ , the number of nonlinear Jacobian variables. If there are any nonlinear constraints, the leading $n_{1}^{''}$ columns of $A$ belong to the nonlinear Jacobian variables. If $n_{1}^{'} > 0$ and $n_{1}^{''} > 0$ , the nonlinear objective and Jacobian variables overlap. The total number of nonlinear variables is given by $\bar{n} = \max (n_{1}^{'}, n_{1}^{''})$ .

iobj : int

If $i o b j > n c n l n$ , row $i o b j$ of $A$ is a free row containing the nonzero elements of the linear part of the objective function.

$i o b j = 0$

There is no free row.

$i o b j = - 1$

There is a dummy ‘free’ row.

a : float, array-like, shape $(nnz)$

The nonzero elements of the Jacobian matrix $A$ , ordered by increasing column index. Since the constraint Jacobian matrix $J (x^{''})$ must always appear in the top left-hand corner of $A$ , those elements in a column associated with any nonlinear constraints must come before any elements belonging to the linear constraint matrix $G$ and the free row (if any; see $i o b j$ ).

In general, $a$ is partitioned into a nonlinear part and a linear part corresponding to the nonlinear variables and linear variables in the problem.

Elements in the nonlinear part may be set to any value (e.g., zero) because they are initialized at the first point that satisfies the linear constraints and the upper and lower bounds.

If ‘Derivative Level’ $= 2$ or $3$ , the nonlinear part may also be used to store any constant Jacobian elements.

Note that if $c o n f u n$ does not define the constant Jacobian element $f j a c [i - 1]$ , then the missing value will be obtained directly from $a [j]$ for some $j \geq i$ .

If ‘Derivative Level’ $= 0$ or $1$ , unassigned elements of $f j a c$ are not treated as constant; they are estimated by finite differences, at nontrivial expense.

The linear part must contain the nonzero elements of $G$ and the free row (if any).

If $i o b j = - 1$ , set $a [0] = 0$ .

Elements with the same row and column indices are not allowed. (See also the descriptions for $h a$ and $k a$ .)

ha : int, array-like, shape $(nnz)$

$h a [i - 1]$ must contain the row index of the nonzero element stored in $a [i - 1]$ , for $i = 1, 2, \dots, nnz$ . The row indices for a column may be supplied in any order subject to the condition that those elements in a column associated with any nonlinear constraints must appear before those elements associated with any linear constraints (including the free row, if any). Note that $c o n f u n$ must define the Jacobian elements in the same order. If $i o b j = - 1$ , set $h a [0] = 1$ .

ka : int, array-like, shape $(n + 1)$

$k a [j - 1]$ must contain the index in $a$ of the start of the $j$ th column, for $j = 1, 2, \dots, n$ . To specify the $j$ th column as empty, set $k a [j - 1] = k a [j]$ . Note that the first and last elements of $k a$ must be such that $k a [0] = 1$ and $k a [n] = nnz + 1$ . If $i o b j = - 1$ , set $k a [j - 1] = 2$ , for $j = 2, 3, \dots, n$ .
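The storage scheme described for $a$ , $h a$ and $k a$ is a column-compressed layout with 1-based indices. The following is a minimal sketch, assuming a small dense matrix held as a list of rows; the helper names `to_nag_sparse` and `dummy_free_row` are hypothetical and are not part of naginterfaces. Scanning each column from the top row down keeps nonlinear-constraint rows ahead of linear rows, provided the nonlinear constraints occupy the first rows of the matrix.

```python
def to_nag_sparse(dense):
    """Convert a dense matrix (list of rows) into the (a, ha, ka) triple.

    Row indices in ha and column starts in ka are 1-based, following the
    descriptions of ha and ka above.
    """
    m = len(dense)
    n = len(dense[0])
    a, ha, ka = [], [], [1]          # ka[0] must be 1
    for j in range(n):               # walk the columns in increasing order
        for i in range(m):           # top-down: nonlinear rows come first
            if dense[i][j] != 0.0:
                a.append(float(dense[i][j]))
                ha.append(i + 1)
        ka.append(len(a) + 1)        # start of the next column; ka[n] = nnz + 1
    return a, ha, ka


def dummy_free_row(n):
    """Arrays for the iobj = -1 case: one explicit zero in row 1, column 1."""
    return [0.0], [1], [1] + [2] * n


# Illustrative 3 x 3 matrix whose second column is empty.
dense = [
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 3.0],
    [4.0, 0.0, 0.0],
]
a, ha, ka = to_nag_sparse(dense)
```

For this matrix the empty second column shows up as two equal consecutive entries in `ka`, which is the convention stated above.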

bl : float, array-like, shape $(n + m)$

$l$ , the lower bounds for all the variables and general constraints, in the following order. The first $n$ elements of $b l$ must contain the bounds on the variables $x$ , the next $n c n l n$ elements the bounds for the nonlinear constraints $F (x)$ (if any) and the next ( $m - n c n l n$ ) elements the bounds for the linear constraints $G x$ and the free row (if any). To specify a nonexistent lower bound (i.e., $l_{j} = - \infty$ ), set $b l [j - 1] \leq - bigbnd$ . To specify the $j$ th constraint as an equality, set $b l [j - 1] = b u [j - 1] = β$ , say, where $| β | < bigbnd$ . If $i o b j = - 1$ , set $b l [n + a b s (i o b j) - 1] \leq - bigbnd$ .

bu : float, array-like, shape $(n + m)$

$u$ , the upper bounds for all the variables and general constraints, in the following order. The first $n$ elements of $b u$ must contain the bounds on the variables $x$ , the next $n c n l n$ elements the bounds for the nonlinear constraints $F (x)$ (if any) and the next ( $m - n c n l n$ ) elements the bounds for the linear constraints $G x$ and the free row (if any). To specify a nonexistent upper bound (i.e., $u_{j} = + \infty$ ), set $b u [j - 1] \geq bigbnd$ . To specify the $j$ th constraint as an equality, set $b u [j - 1] = b l [j - 1] = β$ , say, where $| β | < bigbnd$ . If $i o b j = - 1$ , set $b u [n + a b s (i o b j) - 1] \geq bigbnd$ .
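The ordering of $b l$ and $b u$ can be illustrated with a short sketch. It assumes that $bigbnd$ is $10^{20}$ (check the value your option settings actually use), and the helper `assemble_bounds` is hypothetical; a bound given as `None` stands for a nonexistent bound.

```python
BIGBND = 1.0e20  # assumed value of bigbnd; substitute the one your options use


def assemble_bounds(var_bounds, nonlinear_bounds, linear_bounds, bigbnd=BIGBND):
    """Concatenate (lower, upper) pairs in the required order.

    Order: the n variables, then the ncnln nonlinear constraints, then the
    m - ncnln linear constraints and free row. None marks a missing bound.
    """
    bl, bu = [], []
    pairs = list(var_bounds) + list(nonlinear_bounds) + list(linear_bounds)
    for lower, upper in pairs:
        bl.append(-bigbnd if lower is None else float(lower))
        bu.append(bigbnd if upper is None else float(upper))
    return bl, bu


# Illustrative problem: n = 2 variables, 1 nonlinear constraint,
# 1 linear equality constraint and 1 free row (m = 3).
bl, bu = assemble_bounds(
    var_bounds=[(0.0, None), (None, None)],
    nonlinear_bounds=[(None, 4.0)],
    linear_bounds=[(1.0, 1.0), (None, None)],
)
```

The equality constraint appears as equal entries in `bl` and `bu`, and the free row receives infinite bounds on both sides.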

start : str, length 1

Indicates how a starting basis is to be obtained.

$s t a r t ='C'$

An internal Crash procedure will be used to choose an initial basis.

$s t a r t ='W'$

A basis is already defined in $i s t a t e$ and $n s$ (probably from a previous call).

names : str, length 8, array-like, shape $(nname)$

Specifies the column and row names to be used in the printed output.

If $nname = 1$ , $n a m e s$ is not referenced and the printed output will use default names for the columns and rows.

If $nname = n + m$ , the first $n$ elements must contain the names for the columns, the next $n c n l n$ elements must contain the names for the nonlinear rows (if any) and the next $(m - n c n l n)$ elements must contain the names for the linear rows (if any) to be used in the printed output.

Note that the name for the free row or dummy ‘free’ row must be stored in $n a m e s [n + a b s (i o b j) - 1]$ .

ns : int

$n_{S}$ , the number of superbasics. It need not be specified if $s t a r t ='C'$ , but must retain its value from a previous call when $s t a r t ='W'$ .

xs : float, array-like, shape $(n + m)$

The initial values of the variables and slacks $(x, s)$ . (See the description for $i s t a t e$ .)

istate : int, array-like, shape $(n + m)$

If $s t a r t ='C'$ , the first $n$ elements of $i s t a t e$ and $x s$ must specify the initial states and values, respectively, of the variables $x$ . (The slacks $s$ need not be initialized.) An internal Crash procedure is then used to select an initial basis matrix $B$ . The initial basis matrix will be triangular (neglecting certain small elements in each column). It is chosen from various rows and columns of $(\begin{matrix} A & - I \end{matrix})$ . Possible values for $i s t a t e [j - 1]$ are as follows:

$i s t a t e [j - 1]$	State of $x s [j - 1]$ during Crash procedure
$0$ or $1$	Eligible for the basis
$2$	Ignored
$3$	Eligible for the basis (given preference over $0$ or $1$ )
$4$ or $5$	Ignored

If nothing special is known about the problem, or there is no wish to provide special information, you may set $i s t a t e [j - 1] = 0$ and $x s [j - 1] = 0.0$ , for $j = 1, 2, \dots, n$ .

All variables will then be eligible for the initial basis.

Less trivially, to say that the $j$ th variable will probably be equal to one of its bounds, set $i s t a t e [j - 1] = 4$ and $x s [j - 1] = b l [j - 1]$ or $i s t a t e [j - 1] = 5$ and $x s [j - 1] = b u [j - 1]$ as appropriate.

Following the Crash procedure, variables for which $i s t a t e [j - 1] = 2$ are made superbasic.

Other variables not selected for the basis are then made nonbasic at the value $x s [j - 1]$ if $b l [j - 1] \leq x s [j - 1] \leq b u [j - 1]$ , or at the value $b l [j - 1]$ or $b u [j - 1]$ closest to $x s [j - 1]$ .
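The placement rule for nonbasic variables amounts to projecting $x s [j - 1]$ onto the interval $[b l [j - 1], b u [j - 1]]$ . A minimal sketch follows; the function name `nonbasic_value` is hypothetical and the code only mirrors the rule as stated, not the library's internals.

```python
def nonbasic_value(xs_j, bl_j, bu_j):
    """Value at which a nonbasic variable is held after the Crash procedure.

    Inside the bounds the supplied value is kept; outside, the nearer bound
    is used, which is the same as clamping to [bl_j, bu_j].
    """
    return min(max(xs_j, bl_j), bu_j)


inside = nonbasic_value(0.3, 0.0, 1.0)    # kept as supplied
below = nonbasic_value(-2.0, 0.0, 1.0)    # moved to the lower bound
above = nonbasic_value(5.0, 0.0, 1.0)     # moved to the upper bound
```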

If $s t a r t ='W'$ , $i s t a t e$ and $x s$ must specify the initial states and values, respectively, of the variables and slacks $(x, s)$ .

If the function has been called previously with the same values of $n$ and $m$ , $i s t a t e$ already contains satisfactory information.

clamda : float, array-like, shape $(n + m)$

If $n c n l n > 0$ , $c l a m d a [j - 1]$ must contain a Lagrange multiplier estimate for the $j$ th nonlinear constraint $F_{j} (x)$ , for $j = n + 1, \dots, n + n c n l n$ . If nothing special is known about the problem, or there is no wish to provide special information, you may set $c l a m d a [j - 1] = 0.0$ . The remaining elements need not be set.

comm : dict, communication object, modified in place

Communication structure.

This argument must have been initialized by a prior call to nlp1_init().

confun : None or callable (mode, f, fjac) = confun(mode, ncnln, x, fjac, nstate, data=None), optional

Note: if this argument is None then a NAG-supplied facility will be used.

$c o n f u n$ must calculate the vector $F (x)$ of nonlinear constraint functions and (optionally) its Jacobian $(= \frac{\partial F}{\partial x})$ for a specified $n_{1}^{''}$ ( $\leq n$ ) element vector $x$ .

If there are no nonlinear constraints (i.e., $n c n l n = 0$ ), $c o n f u n$ will never be called by nlp1_sparse_solve and $c o n f u n$ may be None. If there are nonlinear constraints, the first call to $c o n f u n$ will occur before the first call to $o b j f u n$ .

Parameters

mode : int

Indicates which values must be assigned during each call of $c o n f u n$ . Only the following values need be assigned:

$m o d e = 0$

$f$ .

$m o d e = 1$

All available elements of $f j a c$ .

$m o d e = 2$

$f$ and all available elements of $f j a c$ .

ncnln : int

$n_{N}$ , the number of nonlinear constraints. These must be the first $n c n l n$ constraints in the problem.

x : float, ndarray, shape $(njnln)$

$x$ , the vector of nonlinear Jacobian variables at which the nonlinear constraint functions and/or the available elements of the constraint Jacobian are to be evaluated.

fjac : float, ndarray, shape $(nnzjac)$

The elements of $f j a c$ are set to special values which enable nlp1_sparse_solve to detect whether they are changed by $c o n f u n$ .

nstate : int

If $n s t a t e = 1$ , then nlp1_sparse_solve is calling $c o n f u n$ for the first time. This argument setting allows you to save computation time if certain data must be read or calculated only once.

If $n s t a t e \geq 2$ , nlp1_sparse_solve is calling $c o n f u n$ for the last time.

This argument setting allows you to perform some additional computation on the final solution.

In general, the last call to $c o n f u n$ is made with $n s t a t e = 2 + errno$ (see Exceptions).

Otherwise, $n s t a t e = 0$ .

data : arbitrary, optional, modifiable in place

User-communication data for callback functions.

Returns

mode : int

You may set $m o d e$ to a negative value as follows:

$m o d e \leq - 2$

The solution to the current problem is terminated and in this case nlp1_sparse_solve will terminate with $errno$ set to $m o d e$ .

$m o d e = - 1$

The nonlinear constraint functions cannot be calculated at the current $x$ . nlp1_sparse_solve will then terminate with $e r r n o$ = -1 unless this occurs during the linesearch; in this case, the linesearch will shorten the step and try again.

f : float, array-like, shape $(n c n l n)$

If $m o d e = 0$ or $2$ , $f [i - 1]$ must contain the value of the $i$ th nonlinear constraint function at $x$ .

fjac : float, array-like, shape $(nnzjac)$

If $m o d e = 1$ or $2$ , $f j a c$ must return the available elements of the constraint Jacobian evaluated at $x$ . These elements must be stored in exactly the same positions as implied by the definitions of the arrays $a$ , $h a$ and $k a$ . If option ‘Derivative Level’ $= 2$ or $3$ , the value of any constant Jacobian element not defined by $c o n f u n$ will be obtained directly from $a$ . Note that the function does not perform any internal checks for consistency (except indirectly via the option ‘Verify Level’), so great care is essential.
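A callback with the documented signature might look like the sketch below. The two constraint functions $F_{1} = x_{1}^{2} + x_{2}^{2}$ and $F_{2} = x_{1} x_{2}$ are invented for illustration, and the Jacobian is assumed to be dense, so `fjac` holds four entries in column order (matching the ordering that $a$ , $h a$ and $k a$ would imply for such a problem). Plain lists are used here so the sketch runs on its own; the solver itself passes ndarrays.

```python
def confun(mode, ncnln, x, fjac, nstate, data=None):
    """Hypothetical constraints F1 = x1^2 + x2^2 and F2 = x1 * x2."""
    f = [0.0] * ncnln
    if mode in (0, 2):
        # Function values are only needed for mode 0 or 2.
        f[0] = x[0] ** 2 + x[1] ** 2
        f[1] = x[0] * x[1]
    if mode in (1, 2):
        # Jacobian entries stored column by column (assumed dense pattern):
        # column 1 = (dF1/dx1, dF2/dx1), column 2 = (dF1/dx2, dF2/dx2).
        fjac[0] = 2.0 * x[0]
        fjac[1] = x[1]
        fjac[2] = 2.0 * x[1]
        fjac[3] = x[0]
    return mode, f, fjac


mode_out, f_out, fjac_out = confun(2, 2, [1.0, 2.0], [0.0] * 4, 0)
_, _, fjac_untouched = confun(0, 2, [1.0, 2.0], [-9.0] * 4, 0)
```

Leaving `fjac` untouched when $m o d e = 0$ avoids computing derivatives that are not requested, which matters when a nonderivative linesearch is in use.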

objfun : None or callable (mode, objf, objgrd) = objfun(mode, x, objgrd, nstate, data=None), optional

Note: if this argument is None then a NAG-supplied facility will be used.

$o b j f u n$ must calculate the nonlinear part of the objective function $f (x)$ and (optionally) its gradient $(= \frac{\partial f}{\partial x})$ for a specified $n_{1}^{'}$ ( $\leq n$ ) element vector $x$ .

If there are no nonlinear objective variables (i.e., $n o n l n = 0$ ), $o b j f u n$ will never be called by nlp1_sparse_solve and $o b j f u n$ may be None.

Parameters

mode : int

Indicates which values must be assigned during each call of $o b j f u n$ . Only the following values need be assigned:

$m o d e = 0$

$o b j f$ .

$m o d e = 1$

All available elements of $o b j g r d$ .

$m o d e = 2$

$o b j f$ and all available elements of $o b j g r d$ .

x : float, ndarray, shape $(nonln)$

$x$ , the vector of nonlinear variables at which the nonlinear part of the objective function and/or all available elements of its gradient are to be evaluated.

objgrd : float, ndarray, shape $(nonln)$

The elements of $o b j g r d$ are set to special values which enable nlp1_sparse_solve to detect whether they are changed by $o b j f u n$ .

nstate : int

If $n s t a t e = 1$ , nlp1_sparse_solve is calling $o b j f u n$ for the first time. This argument setting allows you to save computation time if certain data must be read or calculated only once.

If $n s t a t e \geq 2$ , nlp1_sparse_solve is calling $o b j f u n$ for the last time.

This argument setting allows you to perform some additional computation on the final solution.

In general, the last call to $o b j f u n$ is made with $n s t a t e = 2 + errno$ (see Exceptions).

Otherwise, $n s t a t e = 0$ .

data : arbitrary, optional, modifiable in place

User-communication data for callback functions.

Returns

mode : int

You may set $m o d e$ to a negative value as follows:

$m o d e \leq - 2$

The solution to the current problem is terminated and in this case nlp1_sparse_solve will terminate with $errno$ set to $m o d e$ .

$m o d e = - 1$

The nonlinear part of the objective function cannot be calculated at the current $x$ . nlp1_sparse_solve will then terminate with $e r r n o$ = -1 unless this occurs during the linesearch; in this case, the linesearch will shorten the step and try again.

objf : float

If $m o d e = 0$ or $2$ , $o b j f$ must be set to the value of the objective function at $x$ .

objgrd : float, array-like, shape $(nonln)$

If $m o d e = 1$ or $2$ , $o b j g r d$ must return the available elements of the gradient evaluated at $x$ .
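The sketch below shows one way an objective callback could use the $m o d e = - 1$ return to report a point at which the function is undefined. The objective $- \log x_{1} + x_{2}^{2}$ is invented for illustration, and plain lists stand in for the ndarrays the solver would pass.

```python
import math


def objfun(mode, x, objgrd, nstate, data=None):
    """Hypothetical nonlinear objective -log(x1) + x2^2."""
    if x[0] <= 0.0:
        # The logarithm is undefined here: report that the objective
        # cannot be calculated at the current x.
        return -1, 0.0, objgrd
    objf = 0.0
    if mode in (0, 2):
        objf = -math.log(x[0]) + x[1] ** 2
    if mode in (1, 2):
        objgrd[0] = -1.0 / x[0]
        objgrd[1] = 2.0 * x[1]
    return mode, objf, objgrd


ok_mode, ok_objf, ok_grad = objfun(2, [1.0, 3.0], [0.0, 0.0], 0)
bad_mode, _, bad_grad = objfun(0, [-1.0, 0.0], [7.0, 7.0], 0)
```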

leniz : None or int, optional

Note: if this argument is None then a default value will be used, determined as follows: $max (500, n + m)$ .

The dimension of the internal workspace array $iz$ .

lenz : int, optional

The dimension of the internal workspace array $z$ .

data : arbitrary, optional

User-communication data for callback functions.

io_manager : FileObjManager, optional

Manager for I/O in this routine.

Returns

a : float, ndarray, shape $(nnz)$

Elements in the nonlinear part corresponding to nonlinear Jacobian variables are overwritten.

ns : int

The final number of superbasics.

xs : float, ndarray, shape $(n + m)$

The final values of the variables and slacks $(x, s)$ .

istate : int, ndarray, shape $(n + m)$

The final states of the variables and slacks $(x, s)$ . The significance of each possible value of $i s t a t e [j - 1]$ is as follows:

$i s t a t e [j - 1]$	State of variable $j$	Normal value of $x s [j - 1]$
$0$	Nonbasic	$b l [j - 1]$
$1$	Nonbasic	$b u [j - 1]$
$2$	Superbasic	Between $b l [j - 1]$ and $b u [j - 1]$
$3$	Basic	Between $b l [j - 1]$ and $b u [j - 1]$

If $n i n f = 0$ , basic and superbasic variables may be outside their bounds by as much as the value of the option ‘Minor Feasibility Tolerance’.

Note that if scaling is specified, the option ‘Minor Feasibility Tolerance’ applies to the variables of the scaled problem.

In this case, the variables of the original problem may be as much as $0.1$ outside their bounds, but this is unlikely unless the problem is very badly scaled.

Very occasionally some nonbasic variables may be outside their bounds by as much as the option ‘Minor Feasibility Tolerance’ and there may be some nonbasic variables for which $x s [j - 1]$ lies strictly between its bounds.

If $n i n f > 0$ , some basic and superbasic variables may be outside their bounds by an arbitrary amount (bounded by $s i n f$ if scaling was not used).

clamda : float, ndarray, shape $(n + m)$

A set of Lagrange multipliers for the bounds on the variables (reduced costs) and the general constraints (shadow costs). More precisely, the first $n$ elements contain the multipliers for the bounds on the variables, the next $n c n l n$ elements contain the multipliers for the nonlinear constraints $F (x)$ (if any) and the next ( $m - n c n l n$ ) elements contain the multipliers for the linear constraints $G x$ and the free row (if any).

miniz : int

The minimum value of $l e n i z$ required to start solving the problem. If $e r r n o$ = 12, nlp1_sparse_solve may be called again with $l e n i z$ suitably larger than $m i n i z$ . (The bigger the better, since it is not certain how much workspace the basis factors need.)

minz : int

The minimum value of $l e n z$ required to start solving the problem. If $e r r n o$ = 13, nlp1_sparse_solve may be called again with $l e n z$ suitably larger than $m i n z$ . (The bigger the better, since it is not certain how much workspace the basis factors need.)

ninf : int

The number of constraints that lie outside their bounds by more than the value of the option ‘Minor Feasibility Tolerance’.

If the linear constraints are infeasible, the sum of the infeasibilities of the linear constraints is minimized subject to the upper and lower bounds being satisfied.

In this case, $n i n f$ contains the number of elements of $G x$ that lie outside their upper or lower bounds.

Note that the nonlinear constraints are not evaluated.

Otherwise, the sum of the infeasibilities of the nonlinear constraints is minimized subject to the linear constraints and the upper and lower bounds being satisfied.

In this case, $n i n f$ contains the number of elements of $F (x)$ that lie outside their upper or lower bounds.

sinf : float

The sum of the infeasibilities of constraints that lie outside their bounds by more than the value of the option ‘Minor Feasibility Tolerance’.

obj : float

The value of the objective function.

Other Parameters

‘Central Difference Interval’ : float

Default $= \sqrt[3]{\text{‘Function Precision’}}$

Note that this option does not apply when ‘Derivative Level’ $= 3$ .

The value of $r$ is used near an optimal solution in order to obtain more accurate (but more expensive) estimates of gradients. This requires twice as many function evaluations as compared to using forward differences (see option ‘Forward Difference Interval’). The interval used for the $j$ th variable is $h_{j} = r (1 + | x_{j} |)$ . The resulting gradient estimates should be accurate to $O (r^{2})$ , unless the functions are badly scaled. The switch to central differences is indicated by c at the end of each line of intermediate printout produced by the major iterations (see Major Iteration Printout). See Gill et al. (1981) for a discussion of the accuracy in finite difference approximations.

If $r \leq 0$ , the default value is used.
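To make the interval formula concrete, here is a minimal sketch of a central-difference gradient using $h_{j} = r (1 + | x_{j} |)$ . It illustrates the formula only and is not the solver's implementation; the test function and the value $r = 10^{-5}$ are arbitrary choices.

```python
import math


def central_diff_gradient(func, x, r):
    """Estimate the gradient of func at x with central differences.

    Each variable uses its own interval h_j = r * (1 + |x_j|), and two
    function evaluations are needed per variable.
    """
    grad = []
    for j in range(len(x)):
        h = r * (1.0 + abs(x[j]))
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2.0 * h))
    return grad


def example_func(x):
    # Arbitrary smooth test function with a known gradient.
    return math.sin(x[0]) + x[1] ** 3


grad = central_diff_gradient(example_func, [0.5, 2.0], 1.0e-5)
exact = [math.cos(0.5), 12.0]
```

The two evaluations per variable are the reason central differences cost twice as much as forward differences.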

‘Check Frequency’ : int

Default $= 60$

Every $i$ th minor iteration after the most recent basis factorization, a numerical test is made to see if the current solution $(x, s)$ satisfies the general linear constraints (including any linearized nonlinear constraints). The constraints are of the form $A x - s = b$ , where $s$ is the set of slack variables. If the largest element of the residual vector $r = b - A x + s$ is judged to be too large, the current basis is refactorized and the basic variables recomputed to satisfy the general constraints more accurately.

If $i < 0$ , the default value is used. If $i = 0$ , the value $i = 99999999$ is used and effectively no checks are made.
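The quantity being tested is the residual $r = b - A x + s$ . The sketch below computes its largest element for a small dense example; the helper name `max_residual` is hypothetical, and the threshold the solver uses to judge the residual "too large" is not stated here, so none is applied.

```python
def max_residual(A, x, s, b):
    """Largest absolute element of r = b - A x + s for a dense A (list of rows)."""
    worst = 0.0
    for i, row in enumerate(A):
        ax_i = sum(a_ij * x_j for a_ij, x_j in zip(row, x))
        worst = max(worst, abs(b[i] - ax_i + s[i]))
    return worst


A = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
b = [0.0, 0.0]
consistent = max_residual(A, x, [3.0, 7.0], b)   # s = A x - b exactly
drifted = max_residual(A, x, [3.0, 7.5], b)      # second slack is off by 0.5
```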

‘Crash Option’ : int

Default $= 0$ or $3$

The default value of $i$ is $0$ if there are any nonlinear constraints and $3$ otherwise. Note that this option does not apply when $s t a r t ='W'$ (see Parameters).

If $s t a r t ='C'$ , an internal Crash procedure is used to select an initial basis from various rows and columns of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ . The value of $i$ determines which rows and columns of $A$ are initially eligible for the basis and how many times the Crash procedure is called. Columns of $- I$ are used to pad the basis where necessary. The possible choices for $i$ are the following.

$i$	Meaning
0	The initial basis contains only slack variables: $B = I$ .
1	The Crash procedure is called once (looking for a triangular basis in all rows and columns of $A$ ).
2	The Crash procedure is called twice (if there are any nonlinear constraints). The first call looks for a triangular basis in linear rows and the iteration proceeds with simplex iterations until the linear constraints are satisfied. The Jacobian is then evaluated for the first major iteration and the Crash procedure is called again to find a triangular basis in the nonlinear rows (whilst retaining the current basis for linear rows).
3	The Crash procedure is called up to three times (if there are any nonlinear constraints). The first two calls treat linear equality constraints and linear inequality constraints separately. The Jacobian is then evaluated for the first major iteration and the Crash procedure is called again to find a triangular basis in the nonlinear rows (whilst retaining the current basis for linear rows).

If $i < 0$ or $i > 3$ , the default value is used.

If $i \geq 1$ , certain slacks on inequality rows are selected for the basis first. (If $i \geq 2$ , numerical values are used to exclude slacks that are close to a bound.) The Crash procedure then makes several passes through the columns of $A$ , searching for a basis matrix that is essentially triangular. A column is assigned to ‘pivot’ on a particular row if the column contains a suitably large element in a row that has not yet been assigned. (The pivot elements ultimately form the diagonals of the triangular basis.) For remaining unassigned rows, slack variables are inserted to complete the basis.

‘Crash Tolerance’ : float

Default $= 0.1$

The value $r$ ( $0 \leq r < 1$ ) allows the Crash procedure to ignore certain ‘small’ nonzero elements in the columns of $A$ while searching for a triangular basis. If $a_{\max}$ is the largest element in the $j$ th column, other nonzeros $a_{i j}$ in the column are ignored if $| a_{i j} | \leq a_{\max} \times r$ .

When $r > 0$ , the basis obtained by the Crash procedure may not be strictly triangular, but it is likely to be nonsingular and almost triangular. The intention is to obtain a starting basis containing more columns of $A$ and fewer (arbitrary) slacks. A feasible solution may be reached earlier on some problems.

If $r < 0$ or $r \geq 1$ , the default value is used.
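The filtering rule can be sketched as follows, under the assumption that ‘largest element’ means largest in magnitude. The helper `significant_rows` is hypothetical and represents a column as a mapping from row index to value.

```python
def significant_rows(column, r):
    """Rows of one column that the Crash procedure would still consider.

    column maps row index -> nonzero value. An entry is ignored when
    |a_ij| <= a_max * r, where a_max is taken as the largest magnitude
    in the column (an assumption about the meaning of 'largest element').
    """
    a_max = max(abs(v) for v in column.values())
    return sorted(i for i, v in column.items() if abs(v) > a_max * r)


column = {1: 10.0, 2: 0.5, 3: -2.0}
kept_default = significant_rows(column, 0.1)   # threshold is 1.0
kept_strict = significant_rows(column, 0.0)    # nothing is ignored
```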

‘Defaults’ : valueless

This special keyword may be used to reset all options to their default values.

‘Derivative Level’ : int

Default $= 3$

This argument indicates which nonlinear function gradients are provided in functions $o b j f u n$ and $c o n f u n$ . The possible choices for $i$ are the following.

$i$	Meaning
3	All elements of the objective gradient and the constraint Jacobian are provided.
2	All elements of the constraint Jacobian are provided, but some (or all) elements of the objective gradient are not specified.
1	All elements of the objective gradient are provided, but some (or all) elements of the constraint Jacobian are not specified.
0	Some (or all) elements of both the objective gradient and the constraint Jacobian are not specified.

The default value $i = 3$ should be used whenever possible. It is the most reliable and will usually be the most efficient.

If $i = 0$ or $2$ , nlp1_sparse_solve will estimate the unspecified elements of the objective gradient, using finite differences. This may simplify the coding of $o b j f u n$ . However, the computation of finite difference approximations usually increases the total run-time substantially (since a call to $o b j f u n$ is required for each unspecified element) and there is less assurance that an acceptable solution will be found.

If $i = 0$ or $1$ , nlp1_sparse_solve will approximate unspecified elements of the constraint Jacobian. For each column of the Jacobian, one call to $c o n f u n$ is needed to estimate all unspecified elements in that column (if any). For example, if the sparsity pattern of the Jacobian has the form

$$\begin{pmatrix} * & * & * & \\ & ? & ? & \\ * & & ? & \\ * & & & * \end{pmatrix}$$

where ‘ $*$ ’ indicates an element provided and ‘?’ indicates an unspecified element, nlp1_sparse_solve will call $c o n f u n$ twice: once to estimate the missing element in column $2$ and again to estimate the two missing elements in column $3$ . (Since columns $1$ and $4$ are known, they require no calls to $c o n f u n$ .)

At times, central differences are used rather than forward differences, in which case twice as many calls to $o b j f u n$ and $c o n f u n$ are needed. (The switch to central differences is not under your control.)

If $i < 0$ or $i > 3$ , the default value is used.
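The counting rule (one call to $c o n f u n$ per column containing unspecified elements, doubled under central differences) can be sketched as below. The pattern strings are an illustrative encoding, not a library format: ‘*’ marks a provided element, ‘?’ an unspecified one, and a space a structural zero.

```python
def confun_calls_for_jacobian(pattern, central=False):
    """Number of confun calls needed to estimate the missing Jacobian entries.

    pattern is a list of equal-length strings, one per row. A column needs
    one call if it contains any '?', and two with central differences.
    """
    ncols = len(pattern[0])
    columns_with_gaps = sum(
        1 for j in range(ncols) if any(row[j] == "?" for row in pattern)
    )
    return columns_with_gaps * (2 if central else 1)


# One missing element in column 2 and two in column 3;
# columns 1 and 4 are fully specified.
pattern = [
    "*** ",
    " ?? ",
    "* ? ",
    "*  *",
]
forward_calls = confun_calls_for_jacobian(pattern)
central_calls = confun_calls_for_jacobian(pattern, central=True)
```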

‘Derivative Linesearch’ : valueless

Default

At each major iteration, a linesearch is used to improve the value of the Lagrangian merit function [equation]. The default linesearch uses safeguarded cubic interpolation and requires both function and gradient values in order to compute estimates of the step $α_{k}$ . If some analytic derivatives are not provided or option ‘Nonderivative Linesearch’ is specified, a linesearch based upon safeguarded quadratic interpolation (which does not require the evaluation or approximation of any gradients) is used instead.

A nonderivative linesearch can be slightly less robust on difficult problems and it is recommended that the default be used if the functions and their derivatives can be computed at approximately the same cost. If the gradients are very expensive to compute relative to the functions however, a nonderivative linesearch may result in a significant decrease in the total run-time.

If option ‘Nonderivative Linesearch’ is selected, nlp1_sparse_solve signals the evaluation of the linesearch by calling $o b j f u n$ and $c o n f u n$ with $m o d e = 0$ . Once the linesearch is complete, the nonlinear functions are re-evaluated with $m o d e = 2$ . If the potential savings offered by a nonderivative linesearch are to be fully realised, it is essential that $o b j f u n$ and $c o n f u n$ be coded so that no derivatives are computed when $m o d e = 0$ .

‘Nonderivative Linesearch’ : valueless

At each major iteration, a linesearch is used to improve the value of the Lagrangian merit function [equation]. The default linesearch uses safeguarded cubic interpolation and requires both function and gradient values in order to compute estimates of the step $α_{k}$ . If some analytic derivatives are not provided or option ‘Nonderivative Linesearch’ is specified, a linesearch based upon safeguarded quadratic interpolation (which does not require the evaluation or approximation of any gradients) is used instead.

A nonderivative linesearch can be slightly less robust on difficult problems and it is recommended that the default be used if the functions and their derivatives can be computed at approximately the same cost. If the gradients are very expensive to compute relative to the functions however, a nonderivative linesearch may result in a significant decrease in the total run-time.

If option ‘Nonderivative Linesearch’ is selected, nlp1_sparse_solve signals the evaluation of the linesearch by calling $o b j f u n$ and $c o n f u n$ with $m o d e = 0$ . Once the linesearch is complete, the nonlinear functions are re-evaluated with $m o d e = 2$ . If the potential savings offered by a nonderivative linesearch are to be fully realised, it is essential that $o b j f u n$ and $c o n f u n$ be coded so that no derivatives are computed when $m o d e = 0$ .

‘Elastic Weight’ : float

Default $= 1.0$ or $100.0$

The default value of $r$ is $100.0$ if there are any nonlinear constraints and $1.0$ otherwise.

This option defines the initial weight $γ$ associated with problem [equation].

At any given major iteration $k$ , elastic mode is entered if the QP subproblem is infeasible or the QP dual variables (Lagrange multipliers) are larger in magnitude than $r \times (1 + \| g (x_{k}) \|_{2})$ , where $g$ is the objective gradient. In either case, the QP subproblem is re-solved in elastic mode with $γ = r \times (1 + \| g (x_{k}) \|_{2})$ .

Thereafter, $γ$ is increased (subject to a maximum allowable value) at any point that is optimal for problem [equation], but not feasible for problem (1). After the $p$ th increase, $γ = r \times 10^{p} \times (1 + \| g (x_{k_{1}}) \|_{2})$ , where $x_{k_{1}}$ is the iterate at which $γ$ was first needed.

If $r < 0$ , the default value is used.
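The weight schedule can be written out as a short sketch. The helper `elastic_weight` is hypothetical, and the maximum allowable value mentioned above is not modelled because it is not specified here.

```python
import math


def elastic_weight(r, g, p=0):
    """gamma = r * 10**p * (1 + ||g||_2) after the p-th increase (p = 0 initially).

    g is the objective gradient at the iterate where gamma was first needed.
    """
    norm_g = math.sqrt(sum(g_i * g_i for g_i in g))
    return r * 10 ** p * (1.0 + norm_g)


gradient = [3.0, 4.0]                                   # ||g||_2 = 5
initial_gamma = elastic_weight(100.0, gradient)         # on entering elastic mode
after_two = elastic_weight(100.0, gradient, p=2)        # after two increases
```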

‘Expand Frequency’ : int

Default $= 10000$

This option is part of the EXPAND anti-cycling procedure due to Gill et al. (1989), which is designed to make progress even on highly degenerate problems.

For linear models, the strategy is to force a positive step at every iteration, at the expense of violating the constraints by a small amount. Suppose that the value of option ‘Minor Feasibility Tolerance’ is $δ$ . Over a period of $i$ iterations, the feasibility tolerance actually used by nlp1_sparse_solve (i.e., the working feasibility tolerance) increases from $0.5 δ$ to $δ$ (in steps of $0.5 δ / i$ ).

For nonlinear models, the same procedure is used for iterations in which there is only one superbasic variable. (Cycling can only occur when the current solution is at a vertex of the feasible region.) Thus, zero steps are allowed if there is more than one superbasic variable, but otherwise positive steps are enforced.

Increasing the value of $i$ helps reduce the number of slightly infeasible nonbasic variables (most of which are eliminated during the resetting procedure). However, it also diminishes the freedom to choose a large pivot element (see option ‘Pivot Tolerance’).

If $i < 0$ , the default value is used. If $i = 0$ , the value $i = 99999999$ is used and effectively no anti-cycling procedure is invoked.
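The growth of the working feasibility tolerance over one period of $i$ iterations can be tabulated with a small sketch. The helper `working_tolerances` is hypothetical and only reproduces the arithmetic described above.

```python
def working_tolerances(delta, i):
    """Working feasibility tolerance at the start and after each of i iterations.

    The tolerance starts at 0.5 * delta and grows by 0.5 * delta / i per
    iteration, reaching delta after i iterations.
    """
    step = 0.5 * delta / i
    return [0.5 * delta + k * step for k in range(i + 1)]


delta = 1.0e-6            # illustrative 'Minor Feasibility Tolerance'
schedule = working_tolerances(delta, 4)
```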

‘Factorization Frequency’ : int

Default $= 50$ or $100$

The default value of $i$ is $50$ if there are any nonlinear constraints and $100$ otherwise.

If $i > 0$ , at most $i$ basis changes will occur between factorizations of the basis matrix.

For linear problems, the basis factors are usually updated at every iteration. The default value $i = 100$ is reasonable for typical problems, particularly those that are extremely sparse and well-scaled.

When the objective function is nonlinear, fewer basis updates will occur as the solution is approached. The number of iterations between basis factorizations will, therefore, increase. During these iterations a test is made regularly according to the value of option ‘Check Frequency’ to ensure that the general constraints are satisfied. If necessary, the basis will be refactorized before the limit of $i$ updates is reached.

If $i \leq 0$ , the default value is used.

‘Infeasible Exit’valueless

Default

Note that this option is ignored if the value of option ‘Major Iteration Limit’ is exceeded, or the linear constraints are infeasible.

If termination is about to occur at a point that does not satisfy the nonlinear constraints and option ‘Feasible Exit’ is selected, this option requests that additional iterations be performed in order to find a feasible point (if any) for the nonlinear constraints. This involves solving a feasible point problem in which the objective function is omitted.

Otherwise, this option requests no additional iterations be performed.

‘Feasible Exit’valueless

Note that this option is ignored if the value of option ‘Major Iteration Limit’ is exceeded, or the linear constraints are infeasible.

If termination is about to occur at a point that does not satisfy the nonlinear constraints and option ‘Feasible Exit’ is selected, this option requests that additional iterations be performed in order to find a feasible point (if any) for the nonlinear constraints. This involves solving a feasible point problem in which the objective function is omitted.

Otherwise, this option requests no additional iterations be performed.

‘Minimize’valueless

Default

If option ‘Feasible Point’ is selected, this option attempts to find a feasible point (if any) for the nonlinear constraints by omitting the objective function. It can also be used to check whether the nonlinear constraints are feasible.

Otherwise, this option specifies the required direction of the optimization. It applies to both linear and nonlinear terms (if any) in the objective function. Note that if two problems are the same except that one minimizes $f (x)$ and the other maximizes $- f (x)$ , their solutions will be the same but the signs of the dual variables $π_{i}$ and the reduced gradients $d_{j}$ will be reversed.

‘Maximize’valueless

If option ‘Feasible Point’ is selected, this option attempts to find a feasible point (if any) for the nonlinear constraints by omitting the objective function. It can also be used to check whether the nonlinear constraints are feasible.

Otherwise, this option specifies the required direction of the optimization. It applies to both linear and nonlinear terms (if any) in the objective function. Note that if two problems are the same except that one minimizes $f (x)$ and the other maximizes $- f (x)$ , their solutions will be the same but the signs of the dual variables $π_{i}$ and the reduced gradients $d_{j}$ will be reversed.

‘Feasible Point’valueless

If option ‘Feasible Point’ is selected, this option attempts to find a feasible point (if any) for the nonlinear constraints by omitting the objective function. It can also be used to check whether the nonlinear constraints are feasible.

Otherwise, this option specifies the required direction of the optimization. It applies to both linear and nonlinear terms (if any) in the objective function. Note that if two problems are the same except that one minimizes $f (x)$ and the other maximizes $- f (x)$ , their solutions will be the same but the signs of the dual variables $π_{i}$ and the reduced gradients $d_{j}$ will be reversed.

‘Forward Difference Interval’float

Default $= \sqrt{\text{‘Function Precision’}}$

This option defines an interval used to estimate derivatives by forward differences in the following circumstances:

For verifying the objective and/or constraint gradients (see the description of the option ‘Verify Level’).
For estimating unspecified elements of the objective gradient and/or the constraint Jacobian.

A derivative with respect to $x_{j}$ is estimated by perturbing that element of $x$ to the value $x_{j} + r (1 + | x_{j} |)$ and then evaluating $f (x)$ and/or $F (x)$ (as appropriate) at the perturbed point. The resulting gradient estimates should be accurate to $O (r)$ , unless the functions are badly scaled. Judicious alteration of $r$ may sometimes lead to greater accuracy. See Gill et al. (1981) for a discussion of the accuracy in finite difference approximations.

If $r \leq 0$ , the default value is used.
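
A minimal sketch of the forward-difference estimate described above, using the perturbation $r (1 + | x_{j} |)$. It is not the solver's internal code. The helper names are hypothetical, and taking $ϵ$ to be `sys.float_info.epsilon` is an assumption made only to obtain a value of the same order as the documented default.

```python
import math
import sys


def forward_difference_gradient(f, x, r):
    """Estimate the gradient of f at x by forward differences.

    Each element x_j is perturbed by h_j = r * (1 + |x_j|).
    """
    f0 = f(x)
    grad = []
    for j, xj in enumerate(x):
        h = r * (1.0 + abs(xj))
        xp = list(x)
        xp[j] = xj + h
        grad.append((f(xp) - f0) / h)
    return grad


def example_f(x):
    # Illustrative smooth function, not taken from the NAG documentation.
    return x[0] ** 2 + math.sin(x[1])


# Assumption: machine epsilon stands in for the library's epsilon.
eps = sys.float_info.epsilon
function_precision = eps ** 0.8
r_default = math.sqrt(function_precision)

approx = forward_difference_gradient(example_f, [1.0, 0.5], r_default)
exact = [2.0, math.cos(0.5)]
print(approx)
print(exact)
```

The difference between `approx` and `exact` should be of order $r$, in line with the $O (r)$ accuracy quoted above.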

‘Function Precision’float

Default $= ϵ^{0.8}$

This argument defines the relative function precision $ϵ_{r}$ , which is intended to be a measure of the relative accuracy with which the nonlinear functions can be computed. For example, if $f (x)$ (or $F_{i} (x)$ ) is computed as $1000.56789$ for some relevant $x$ and the first $6$ significant digits are known to be correct, then the appropriate value for $ϵ_{r}$ would be $10^{- 6}$ .

Ideally the functions $f (x)$ or $F_{i} (x)$ should have magnitude of order $1$ . If all functions are substantially less than $1$ in magnitude, $ϵ_{r}$ should be the absolute precision. For example, if $f (x)$ (or $F_{i} (x)$ ) is computed as $1.23456789 \times 10^{- 4}$ for some relevant $x$ and the first $6$ significant digits are known to be correct, then the appropriate value for $ϵ_{r}$ would be $10^{- 10}$ .
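
The two worked examples above can be reproduced with a small helper. This is a sketch of the rule of thumb only: relative precision when the function has magnitude at least $1$, absolute precision otherwise. The name `suggest_function_precision` is hypothetical.

```python
import math


def suggest_function_precision(value, correct_digits):
    """Suggest eps_r from a typical (nonzero) function value and the
    number of significant digits known to be correct.

    The relative precision is 10**(-correct_digits). For values smaller
    than 1 in magnitude, the absolute precision is returned instead,
    i.e. the relative precision scaled by the decimal order of the value.
    """
    relative = 10.0 ** (-correct_digits)
    if abs(value) >= 1.0:
        return relative
    exponent = math.floor(math.log10(abs(value)))
    return relative * 10.0 ** exponent


print(suggest_function_precision(1000.56789, 6))
print(suggest_function_precision(1.23456789e-4, 6))
```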

The choice of $ϵ_{r}$ can be quite complicated for badly scaled problems; see Chapter 8 of Gill et al. (1981) for a discussion of scaling techniques. The default value is appropriate for most simple functions that are computed with full accuracy.

In some cases the function values will be the result of extensive computation, possibly involving an iterative procedure that can provide few digits of precision at reasonable cost. Specifying an appropriate value of $r$ may, therefore, lead to savings, by allowing the linesearch procedure to terminate when the difference between function values along the search direction becomes as small as the absolute error in the values.

If $r < ϵ$ or $r \geq 1$ , the default value is used.

‘Hessian Frequency’int

Default $= 99999999$

This option forces the approximate Hessian formed from $i$ BFGS updates to be reset to the identity matrix upon completion of a major iteration. It is intended to be used in conjunction with option ‘Hessian Full Memory’.

If $i \leq 0$ , the default value is used and effectively no resets occur.

‘Hessian Full Memory’valueless

Default when $\bar{n} < 75$

These options specify the method for storing and updating the quasi-Newton approximation to the Hessian of the Lagrangian function.

If ‘Hessian Full Memory’ is specified, the approximate Hessian is treated as a dense matrix and BFGS quasi-Newton updates are applied explicitly. This is most efficient when the total number of nonlinear variables is not too large (say, $\bar{n} < 75$ ). In this case, the storage requirement is fixed and you can expect $1$ -step Q-superlinear convergence to the solution.

‘Hessian Limited Memory’ should only be specified when $\bar{n}$ is very large. In this case a limited memory procedure is used to update a diagonal Hessian approximation $H_{r}$ a limited number of times. (Updates are accumulated as a list of vector pairs. They are discarded at regular intervals after $H_{r}$ has been reset to their diagonal.)

Note that if ‘Hessian Frequency’ $= 20$ is used in conjunction with ‘Hessian Full Memory’, the effect will be similar to using ‘Hessian Limited Memory’ in conjunction with ‘Hessian Updates’ $= 20$ , except that the latter will retain the current diagonal during resets.

‘Hessian Limited Memory’valueless

Default when $\bar{n} \geq 75$

These options specify the method for storing and updating the quasi-Newton approximation to the Hessian of the Lagrangian function.

If ‘Hessian Full Memory’ is specified, the approximate Hessian is treated as a dense matrix and BFGS quasi-Newton updates are applied explicitly. This is most efficient when the total number of nonlinear variables is not too large (say, $\bar{n} < 75$ ). In this case, the storage requirement is fixed and you can expect $1$ -step Q-superlinear convergence to the solution.

‘Hessian Limited Memory’ should only be specified when $\bar{n}$ is very large. In this case a limited memory procedure is used to update a diagonal Hessian approximation $H_{r}$ a limited number of times. (Updates are accumulated as a list of vector pairs. They are discarded at regular intervals after $H_{r}$ has been reset to their diagonal.)

Note that if ‘Hessian Frequency’ $= 20$ is used in conjunction with ‘Hessian Full Memory’, the effect will be similar to using ‘Hessian Limited Memory’ in conjunction with ‘Hessian Updates’ $= 20$ , except that the latter will retain the current diagonal during resets.

‘Hessian Updates’int

Default $= 20$ or $99999999$

The default value of $i$ is $20$ when ‘Hessian Limited Memory’ is in effect and $99999999$ when ‘Hessian Full Memory’ is in effect, in which case no updates are performed.

If ‘Hessian Limited Memory’ is in effect, this option defines the maximum number of pairs of Hessian update vectors that are to be used to define the quasi-Newton approximate Hessian. Once the limit of $i$ updates is reached, all but the diagonal elements of the accumulated updates are discarded and the process starts again. Broadly speaking, the more updates that are stored, the better the quality of the approximate Hessian. On the other hand, the more vectors that are stored, the greater the cost of each QP iteration.

The default value of $i$ is likely to give a robust algorithm without significant expense, but faster convergence may be obtained with far fewer updates (e.g., $i = 5$ ).

If $i < 0$ , the default value is used.

‘Infinite Bound Size’float

Default $= 10^{20}$

If $r > 0$ , $r$ defines the ‘infinite’ bound $bigbnd$ in the definition of the problem constraints. Any upper bound greater than or equal to $bigbnd$ will be regarded as $+ \infty$ (and similarly any lower bound less than or equal to $- bigbnd$ will be regarded as $- \infty$ ).

If $r \leq 0$ , the default value is used.
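
As an illustration of how the ‘infinite’ bound is interpreted, the following sketch maps bound arrays to their effective values. The helper `effective_bounds` is hypothetical and only mirrors the rule stated above; it is not a naginterfaces function.

```python
import math


def effective_bounds(bl, bu, bigbnd=1.0e20):
    """Replace bounds at or beyond +/- bigbnd by -inf / +inf.

    A lower bound <= -bigbnd is treated as -infinity and an upper
    bound >= bigbnd is treated as +infinity.
    """
    lower = [-math.inf if b <= -bigbnd else b for b in bl]
    upper = [math.inf if b >= bigbnd else b for b in bu]
    return lower, upper


lower, upper = effective_bounds([-1.0e20, 0.0, -5.0], [1.0e25, 10.0, 1.0e19])
print(lower)
print(upper)
```

Note that the last upper bound, $10^{19}$, is below the default $bigbnd$ and so remains a finite bound.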

‘Iteration Limit’int

Default $= 10000$

The value of $i$ specifies the maximum number of minor iterations allowed (i.e., iterations of the simplex method or the QP algorithm), summed over all major iterations. (See also the description of the options ‘Major Iteration Limit’ and ‘Minor Iteration Limit’.)

If $i < 0$ , the default value is used.

‘Linesearch Tolerance’float

Default $= 0.9$

This option controls the accuracy with which a step length will be located along the direction of search at each iteration. At the start of each linesearch a target directional derivative for the Lagrangian merit function is identified. The value of $r$ , therefore, determines the accuracy to which this target value is approximated.

The default value $r = 0.9$ requests an inaccurate search and is appropriate for most problems, particularly those with any nonlinear constraints.

If the nonlinear functions are cheap to evaluate, a more accurate search may be appropriate; try $r = 0.1, 0.01$ or $0.001$ . The number of major iterations required to solve the problem might decrease.

If the nonlinear functions are expensive to evaluate, a less accurate search may be appropriate. If ‘Derivative Level’ $= 3$ , try $r = 0.99$ . (The number of major iterations required to solve the problem might increase, but the total number of function evaluations may decrease enough to compensate.)

If ‘Derivative Level’ $< 3$ , a moderately accurate search may be appropriate; try $r = 0.5$ . Each search will (typically) require only $1$ to $5$ function values, but many function calls will then be needed to estimate the missing gradients for the next iteration.

If $r < 0$ or $r \geq 1$ , the default value is used.

‘List’valueless

Option ‘List’ enables printing of each option specification as it is supplied. ‘Nolist’ suppresses this printing.

‘Nolist’valueless

Default $=$ ‘Nolist’

Option ‘List’ enables printing of each option specification as it is supplied. ‘Nolist’ suppresses this printing.

‘LU Density Tolerance’float

Default $= 0.6$

If $r_{1} > 0$ , $r_{1}$ defines the density tolerance used during the $L U$ factorization of the basis matrix. Columns of $L$ and rows of $U$ are formed one at a time and the remaining rows and columns of the basis are altered appropriately. At any stage, if the density of the remaining matrix exceeds $r_{1}$ , the Markowitz strategy for choosing pivots is terminated. The remaining matrix is then factorized using a dense $L U$ procedure. Increasing the value of $r_{1}$ towards unity may give slightly sparser $L U$ factors, with a slight increase in factorization time. If $r_{1} \leq 0$ , the default value is used.

If $r_{2} > 0$ , $r_{2}$ defines the singularity tolerance used to guard against ill-conditioned basis matrices. Whenever the basis is refactorized, the diagonal elements of $U$ are tested as follows. If $| u_{j j} | \leq r_{2}$ or $| u_{j j} | < r_{2} \times \max_{i} | u_{i j} |$ , the $j$ th column of the basis is replaced by the corresponding slack variable. This is most likely to occur when $\mathrm{start} = \text{'W'}$ (see Parameters), or at the start of a major iteration. If $r_{2} \leq 0$ , the default value is used.

In some cases, the Jacobian matrix may converge to values that make the basis exactly singular (e.g., a whole row of the Jacobian matrix could be zero at an optimal solution). Before exact singularity occurs, the basis could become very ill-conditioned and the optimization could progress very slowly (if at all). Setting $r_{2} = 0.00001$ (say) may, therefore, help cause a judicious change of basis in such situations.

‘LU Singularity Tolerance’float

Default $= ϵ^{0.67}$

If $r_{1} > 0$ , $r_{1}$ defines the density tolerance used during the $L U$ factorization of the basis matrix. Columns of $L$ and rows of $U$ are formed one at a time and the remaining rows and columns of the basis are altered appropriately. At any stage, if the density of the remaining matrix exceeds $r_{1}$ , the Markowitz strategy for choosing pivots is terminated. The remaining matrix is then factorized using a dense $L U$ procedure. Increasing the value of $r_{1}$ towards unity may give slightly sparser $L U$ factors, with a slight increase in factorization time. If $r_{1} \leq 0$ , the default value is used.

If $r_{2} > 0$ , $r_{2}$ defines the singularity tolerance used to guard against ill-conditioned basis matrices. Whenever the basis is refactorized, the diagonal elements of $U$ are tested as follows. If $| u_{j j} | \leq r_{2}$ or $| u_{j j} | < r_{2} \times \max_{i} | u_{i j} |$ , the $j$ th column of the basis is replaced by the corresponding slack variable. This is most likely to occur when $\mathrm{start} = \text{'W'}$ (see Parameters), or at the start of a major iteration. If $r_{2} \leq 0$ , the default value is used.

In some cases, the Jacobian matrix may converge to values that make the basis exactly singular (e.g., a whole row of the Jacobian matrix could be zero at an optimal solution). Before exact singularity occurs, the basis could become very ill-conditioned and the optimization could progress very slowly (if at all). Setting $r_{2} = 0.00001$ (say) may, therefore, help cause a judicious change of basis in such situations.
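
The diagonal test on $U$ described above can be sketched on a small dense matrix. This is an illustration of the two stated conditions only; the solver works with sparse factors, and the function name `singular_columns` is hypothetical.

```python
def singular_columns(U, r2):
    """Return the indices j for which the diagonal of U fails the test.

    Column j is flagged if |u_jj| <= r2, or if |u_jj| is smaller than
    r2 times the largest magnitude in column j.
    """
    n = len(U)
    flagged = []
    for j in range(n):
        ujj = abs(U[j][j])
        col_max = max(abs(U[i][j]) for i in range(n))
        if ujj <= r2 or ujj < r2 * col_max:
            flagged.append(j)
    return flagged


# Illustrative upper triangular factor (made-up numbers).
U = [
    [2.0, 1.0, 5.0e6],
    [0.0, 1.0e-9, 3.0],
    [0.0, 0.0, 1.0],
]
flagged = singular_columns(U, r2=1.0e-5)
print(flagged)
```

Here column 1 fails the absolute test and column 2 fails the relative test, so both would be replaced by slack variables under the rule above.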

‘LU Factor Tolerance’float

Default $= 5.0$ or $100.0$

The default value of $r_{1}$ is $5.0$ if there are any nonlinear constraints and $100.0$ otherwise. The default value of $r_{2}$ is $5.0$ if there are any nonlinear constraints and $10.0$ otherwise.

If $r_{1} \geq 1$ and $r_{2} \geq 1$ , the values of $r_{1}$ and $r_{2}$ affect the stability and sparsity of the basis factorization $B = L U$ , during refactorization and updating, respectively. The lower triangular matrix $L$ is a product of matrices of the form

$$\begin{pmatrix} 1 & \\ μ & 1 \end{pmatrix},$$

where the multipliers $μ$ satisfy $| μ | \leq r_{i}$ . Smaller values of $r_{i}$ favour stability, while larger values favour sparsity. The default values of $r_{1}$ and $r_{2}$ usually strike a good compromise. For large and relatively dense problems, setting $r_{1} = 10.0$ or $5.0$ (say) may give a marked improvement in sparsity without impairing stability to a serious degree. Note that for problems involving band matrices, it may be necessary to reduce $r_{1}$ and/or $r_{2}$ in order to achieve stability.

If $r_{1} < 1$ or $r_{2} < 1$ , the appropriate default value is used.

‘LU Update Tolerance’float

Default $= 5.0$ or $10.0$

The default value of $r_{1}$ is $5.0$ if there are any nonlinear constraints and $100.0$ otherwise. The default value of $r_{2}$ is $5.0$ if there are any nonlinear constraints and $10.0$ otherwise.

If $r_{1} \geq 1$ and $r_{2} \geq 1$ , the values of $r_{1}$ and $r_{2}$ affect the stability and sparsity of the basis factorization $B = L U$ , during refactorization and updating, respectively. The lower triangular matrix $L$ is a product of matrices of the form

$$\begin{pmatrix} 1 & \\ μ & 1 \end{pmatrix},$$

where the multipliers $μ$ satisfy $| μ | \leq r_{i}$ . Smaller values of $r_{i}$ favour stability, while larger values favour sparsity. The default values of $r_{1}$ and $r_{2}$ usually strike a good compromise. For large and relatively dense problems, setting $r_{1} = 10.0$ or $5.0$ (say) may give a marked improvement in sparsity without impairing stability to a serious degree. Note that for problems involving band matrices, it may be necessary to reduce $r_{1}$ and/or $r_{2}$ in order to achieve stability.

If $r_{1} < 1$ or $r_{2} < 1$ , the appropriate default value is used.

‘Major Feasibility Tolerance’float

Default $= \sqrt{ϵ}$

This option specifies how accurately the nonlinear constraints should be satisfied. The default value is appropriate when the linear and nonlinear constraints contain data to approximately that accuracy. A larger value may be appropriate if some of the problem functions are known to be of low accuracy.

Let rowerr be defined as the maximum nonlinear constraint violation normalized by the size of the solution. It is required to satisfy

$$\mathrm{rowerr} = \max_{i} \frac{{viol}_{i}}{∥ (x, s) ∥} \leq r,$$

where ${viol}_{i}$ is the violation of the $i$ th nonlinear constraint.

If $r \leq ϵ$ , the default value is used.
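
A minimal sketch of the rowerr test above. The text does not state which norm is used for $∥ (x, s) ∥$ ; the Euclidean norm is assumed here purely for illustration, and the helper name `rowerr` is hypothetical.

```python
import math


def rowerr(viol, x, s):
    """Maximum nonlinear constraint violation scaled by the size of (x, s).

    Assumption: the size of (x, s) is taken as the Euclidean norm of the
    concatenated vector; the library may use a different norm.
    """
    norm = math.sqrt(sum(v * v for v in list(x) + list(s)))
    if norm == 0.0:
        raise ValueError("(x, s) has zero norm")
    return max(viol) / norm


# Illustrative data: two violations, one variable and one slack.
value = rowerr(viol=[1.0e-8, 3.0e-7], x=[3.0], s=[4.0])
r = 1.0e-6
print(value, value <= r)
```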

‘Major Iteration Limit’int

Default $= 1000$

The value of $i$ specifies the maximum number of major iterations allowed before termination. It is intended to guard against an excessive number of linearizations of the nonlinear constraints. Setting $i = 0$ and ‘Major Print Level’ $> 0$ means that the objective and constraint gradients will be checked if ‘Verify Level’ $> 0$ and the workspace needed to start solving the problem will be computed and printed, but no iterations will be performed.

If $i < 0$ , the default value is used.

‘Major Optimality Tolerance’float

Default $= \sqrt{ϵ}$

This option specifies the final accuracy of the dual variables. If nlp1_sparse_solve terminates with no exception or warning raised, a primal and dual solution ( $x, s, π$ ) will have been computed such that

$$\mathrm{maxgap} = \max_{j} \frac{{gap}_{j}}{∥ π ∥} \leq r,$$

where ${gap}_{j}$ is an estimate of the complementarity gap for the $j$ th variable and $∥ π ∥$ is a measure of the size of the QP dual variables (or Lagrange multipliers) given by

$$∥ π ∥ = \max \left( \frac{σ}{\sqrt{m}}, 1 \right), \quad \text{where} \quad σ = \sum_{i = 1}^{m} | π_{i} | .$$

It is included to make the tests independent of a scale factor on the objective function. Specifically, ${gap}_{j}$ is computed from the final QP solution using the reduced gradients $d_{j} = g_{j} - π^{T} a_{j}$ , where $g_{j}$ is the $j$ th element of the objective gradient and $a_{j}$ is the associated column of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ :

$${gap}_{j} = \begin{cases} d_{j} \min (x_{j} - l_{j}, 1) & \text{if } d_{j} \geq 0; \\ - d_{j} \min (u_{j} - x_{j}, 1) & \text{if } d_{j} < 0 . \end{cases}$$

If $r \leq 0$ , the default value is used.
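
The quantities in the optimality test above can be evaluated directly from their definitions. The following sketch uses made-up data and hypothetical helper names; it illustrates the formulas for $∥ π ∥$ , ${gap}_{j}$ and maxgap rather than the solver's implementation.

```python
import math


def pi_norm(pi):
    """Size of the dual variables: max(sigma / sqrt(m), 1), sigma = sum |pi_i|."""
    sigma = sum(abs(p) for p in pi)
    return max(sigma / math.sqrt(len(pi)), 1.0)


def gap(d, x, l, u):
    """Complementarity gap estimate for one variable."""
    if d >= 0.0:
        return d * min(x - l, 1.0)
    return -d * min(u - x, 1.0)


def maxgap(d, x, l, u, pi):
    """Largest gap over all variables, scaled by the size of pi."""
    gaps = [gap(dj, xj, lj, uj) for dj, xj, lj, uj in zip(d, x, l, u)]
    return max(gaps) / pi_norm(pi)


# Illustrative data (not from the NAG documentation).
pi = [3.0, -4.0, 0.0, 1.0]   # sigma = 8, m = 4, so the norm is 4
d = [2.0, -1.0, 0.0]         # reduced gradients
x = [0.0, 4.5, 1.0]
l = [0.0, 0.0, 0.0]
u = [5.0, 5.0, 5.0]

result = maxgap(d, x, l, u, pi)
print(pi_norm(pi), result)
```

The first variable sits on its lower bound with $d_{j} \geq 0$ , so its gap is zero; the second has $d_{j} < 0$ but is $0.5$ away from its upper bound, which produces the only nonzero gap.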

‘Optimality Tolerance’float

Default $= \sqrt{ϵ}$

This option specifies the final accuracy of the dual variables. If nlp1_sparse_solve terminates with no exception or warning raised, a primal and dual solution ( $x, s, π$ ) will have been computed such that

$$\mathrm{maxgap} = \max_{j} \frac{{gap}_{j}}{∥ π ∥} \leq r,$$

where ${gap}_{j}$ is an estimate of the complementarity gap for the $j$ th variable and $∥ π ∥$ is a measure of the size of the QP dual variables (or Lagrange multipliers) given by

$$∥ π ∥ = \max \left( \frac{σ}{\sqrt{m}}, 1 \right), \quad \text{where} \quad σ = \sum_{i = 1}^{m} | π_{i} | .$$

It is included to make the tests independent of a scale factor on the objective function. Specifically, ${gap}_{j}$ is computed from the final QP solution using the reduced gradients $d_{j} = g_{j} - π^{T} a_{j}$ , where $g_{j}$ is the $j$ th element of the objective gradient and $a_{j}$ is the associated column of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ :

$${gap}_{j} = \begin{cases} d_{j} \min (x_{j} - l_{j}, 1) & \text{if } d_{j} \geq 0; \\ - d_{j} \min (u_{j} - x_{j}, 1) & \text{if } d_{j} < 0 . \end{cases}$$

If $r \leq 0$ , the default value is used.

‘Major Print Level’int

Default $= 0$

The value of $i$ controls the amount of printout produced by the major iterations of nlp1_sparse_solve, as indicated below. A detailed description of the printed output is given in Major Iteration Printout (summary output at each major iteration and the final solution) and Monitoring Information (monitoring information at each major iteration). (See also the description of the option ‘Minor Print Level’.)

The following printout is sent to the file object associated with the advisory I/O unit (see FileObjManager):

$i$	Output
$0$	No output.
$1$	The final solution only.
$5$	One line of summary output ( $< 80$ characters; see Major Iteration Printout) for each major iteration (no printout of the final solution).
$\geq 10$	The final solution and one line of summary output for each major iteration.

The following printout is sent to the unit number given by the option ‘Monitoring File’:

$i$	Output
$0$	No output.
$1$	The final solution only.
$5$	One long line of output ( $< 120$ characters; see Monitoring Information) for each major iteration (no printout of the final solution).
$\geq 10$	The final solution and one long line of output for each major iteration.
$\geq 20$	The final solution, one long line of output for each major iteration, matrix statistics (initial status of rows and columns, number of elements, density, biggest and smallest elements, etc.), details of the scale factors resulting from the scaling procedure (if ‘Scale Option’ $= 1$ or $2$ ), basis factorization statistics and details of the initial basis resulting from the Crash procedure (if $\mathrm{start} = \text{'C'}$ ; see Parameters).

If ‘Major Print Level’ $\geq 5$ and the unit number defined by the option ‘Monitoring File’ is the advisory unit number, the summary output for each major iteration is suppressed.

‘Print Level’int

The value of $i$ controls the amount of printout produced by the major iterations of nlp1_sparse_solve, as indicated below. A detailed description of the printed output is given in Major Iteration Printout (summary output at each major iteration and the final solution) and Monitoring Information (monitoring information at each major iteration). (See also the description of the option ‘Minor Print Level’.)

The following printout is sent to the file object associated with the advisory I/O unit (see FileObjManager):

$i$	Output
$0$	No output.
$1$	The final solution only.
$5$	One line of summary output ( $< 80$ characters; see Major Iteration Printout) for each major iteration (no printout of the final solution).
$\geq 10$	The final solution and one line of summary output for each major iteration.

The following printout is sent to the unit number given by the option ‘Monitoring File’:

$i$	Output
$0$	No output.
$1$	The final solution only.
$5$	One long line of output ( $< 120$ characters; see Monitoring Information) for each major iteration (no printout of the final solution).
$\geq 10$	The final solution and one long line of output for each major iteration.
$\geq 20$	The final solution, one long line of output for each major iteration, matrix statistics (initial status of rows and columns, number of elements, density, biggest and smallest elements, etc.), details of the scale factors resulting from the scaling procedure (if ‘Scale Option’ $= 1$ or $2$ ), basis factorization statistics and details of the initial basis resulting from the Crash procedure (if $\mathrm{start} = \text{'C'}$ ; see Parameters).

If ‘Major Print Level’ $\geq 5$ and the unit number defined by the option ‘Monitoring File’ is the advisory unit number, the summary output for each major iteration is suppressed.

‘Major Step Limit’float

Default $= 2.0$

If $r > 0$ , $r$ limits the change in $x$ during a linesearch. It applies to all nonlinear problems once a ‘feasible solution’ or ‘feasible subproblem’ has been found.

A linesearch determines a step $α$ in the interval $0 < α \leq β$ , where $β = 1$ if there are any nonlinear constraints, or the step to the nearest upper or lower bound on $x$ if all the constraints are linear. Normally, the first step attempted is $α_{1} = \min (1, β)$ .

In some cases, such as $f (x) = a e^{b x}$ or $f (x) = a x^{b}$ , even a moderate change in the elements of $x$ can lead to floating-point overflow. The argument $r$ is, therefore, used to define a step limit $\bar{β}$ given by

$$\bar{β} = \frac{r (1 + {∥ x ∥}_{2})}{{∥ p ∥}_{2}},$$

where $p$ is the search direction and the first evaluation of $f (x)$ is made at the (potentially) smaller step length $α_{1} = \min (1, \bar{β}, β)$ .

Wherever possible, upper and lower bounds on $x$ should be used to prevent evaluation of nonlinear functions at meaningless points. The default value $r = 2.0$ should not affect progress on well-behaved functions, but values such as $r = 0.1$ or $0.01$ may be helpful when rapidly varying functions are present. If a small value of $r$ is selected, a ‘good’ starting point may be required. An important application is to the class of nonlinear least squares problems.

If $r \leq 0$ , the default value is used.
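
To make the effect of the step limit concrete, the sketch below evaluates the step limit $r (1 + ∥ x ∥_{2}) / ∥ p ∥_{2}$ and the first trial step for a short and a long search direction. The function name `first_trial_step` and the data are hypothetical.

```python
import math


def first_trial_step(x, p, r=2.0, beta=1.0):
    """Return the first step length tried by the linesearch.

    The step limit is r * (1 + ||x||_2) / ||p||_2 and the first step is
    the minimum of 1, the step limit and beta. Assumes p is nonzero.
    """
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_p = math.sqrt(sum(v * v for v in p))
    beta_bar = r * (1.0 + norm_x) / norm_p
    return min(1.0, beta_bar, beta)


x = [3.0, 4.0]                                 # ||x||_2 = 5
long_step = first_trial_step(x, [30.0, 40.0])  # ||p||_2 = 50
short_step = first_trial_step(x, [0.3, 0.4])   # ||p||_2 = 0.5
print(long_step, short_step)
```

With the default $r = 2.0$ only the long search direction is cut back, which matches the remark that the default should not affect progress on well-behaved functions.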

‘Minor Feasibility Tolerance’float

Default $= \sqrt{ϵ}$

This option attempts to ensure that all variables eventually satisfy their upper and lower bounds to within the tolerance $r$ . Since this includes slack variables, general linear constraints should also be satisfied to within $r$ . Note that feasibility with respect to nonlinear constraints is judged by the value of option ‘Major Feasibility Tolerance’ and not by $r$ .

If the bounds and linear constraints cannot be satisfied to within $r$ , the problem is declared infeasible. Let Sinf be the corresponding sum of infeasibilities. If Sinf is quite small, it may be appropriate to raise $r$ by a factor of $10$ or $100$ . Otherwise, some error in the data should be suspected.

If ‘Scale Option’ $\geq 1$ , feasibility is defined in terms of the scaled problem (since it is more likely to be meaningful).

Nonlinear functions will only be evaluated at points that satisfy the bounds and linear constraints. If there are regions where a function is undefined, every effort should be made to eliminate these regions from the problem. For example, if $f (x_{1}, x_{2}) = \sqrt{x_{1}} + \log (x_{2})$ , it is essential to place lower bounds on both $x_{1}$ and $x_{2}$ . If the value $r = 10^{- 6}$ is used, the bounds $x_{1} \geq 10^{- 5}$ and $x_{2} \geq 10^{- 4}$ might be appropriate. (The log singularity is more serious; in general, you should attempt to keep $x$ as far away from singularities as possible.)

In reality, $r$ is used as a feasibility tolerance for satisfying the bounds on $x$ and $s$ in each QP subproblem. If the sum of infeasibilities cannot be reduced to zero, the QP subproblem is declared infeasible and the function then remains in elastic mode thereafter (with only the linearized nonlinear constraints defined to be elastic). (See also the description of ‘Elastic Weight’.)

If $r \leq ϵ$ , the default value is used.

‘Feasibility Tolerance’float

Default $= \sqrt{ϵ}$

This option attempts to ensure that all variables eventually satisfy their upper and lower bounds to within the tolerance $r$ . Since this includes slack variables, general linear constraints should also be satisfied to within $r$ . Note that feasibility with respect to nonlinear constraints is judged by the value of option ‘Major Feasibility Tolerance’ and not by $r$ .

If the bounds and linear constraints cannot be satisfied to within $r$ , the problem is declared infeasible. Let Sinf be the corresponding sum of infeasibilities. If Sinf is quite small, it may be appropriate to raise $r$ by a factor of $10$ or $100$ . Otherwise, some error in the data should be suspected.

If ‘Scale Option’ $\geq 1$ , feasibility is defined in terms of the scaled problem (since it is more likely to be meaningful).

Nonlinear functions will only be evaluated at points that satisfy the bounds and linear constraints. If there are regions where a function is undefined, every effort should be made to eliminate these regions from the problem. For example, if $f (x_{1}, x_{2}) = \sqrt{x_{1}} + \log (x_{2})$ , it is essential to place lower bounds on both $x_{1}$ and $x_{2}$ . If the value $r = 10^{- 6}$ is used, the bounds $x_{1} \geq 10^{- 5}$ and $x_{2} \geq 10^{- 4}$ might be appropriate. (The log singularity is more serious; in general, you should attempt to keep $x$ as far away from singularities as possible.)

In reality, $r$ is used as a feasibility tolerance for satisfying the bounds on $x$ and $s$ in each QP subproblem. If the sum of infeasibilities cannot be reduced to zero, the QP subproblem is declared infeasible and the function then remains in elastic mode thereafter (with only the linearized nonlinear constraints defined to be elastic). (See also the description of ‘Elastic Weight’.)

If $r \leq ϵ$ , the default value is used.

‘Minor Iteration Limit’int

Default $= 500$

The value of $i$ specifies the maximum number of iterations allowed between successive linearizations of the nonlinear constraints. A value in the range $10 \leq i \leq 50$ prevents excessive effort being expended on early major iterations, but allows later QP subproblems to be solved to completion. Note that an extra $m$ minor iterations are allowed if the first QP subproblem to be solved starts with the all-slack basis $B = I$ . (See the description of the option ‘Crash Option’.)

In general, it is unsafe to specify values as small as $i = 1$ or $2$ (because even when an optimal solution has been reached, a few minor iterations may be needed for the corresponding QP subproblem to be recognized as optimal).

If $i \leq 0$ , the default value is used.

‘Minor Optimality Tolerance’float

Default $= \sqrt{ϵ}$

This option is used to judge optimality for each QP subproblem. Let the QP reduced gradients be $d_{j} = g_{j} - π^{T} a_{j}$ , where $g_{j}$ is the $j$ th element of the QP gradient, $a_{j}$ is the associated column of the QP constraint matrix and $π$ is the set of QP dual variables.

By construction, the reduced gradients for basic variables are always zero. The QP subproblem will be declared optimal if the reduced gradients for nonbasic variables at their lower or upper bounds satisfy

$$\frac{d_{j}}{∥ π ∥} \geq - r \quad \text{or} \quad \frac{d_{j}}{∥ π ∥} \leq r$$

respectively, and if $\frac{| d_{j} |}{∥ π ∥} \leq r$ for superbasic variables.

Note that $∥ π ∥$ is a measure of the size of the dual variables. It is included to make the tests independent of a scale factor on the objective function. (The value of $∥ π ∥$ actually used is defined in the description for option ‘Major Optimality Tolerance’.)

If the objective is scaled down to be very small, the optimality test reduces to comparing $d_{j}$ against $r$ .

If $r \leq 0$ , the default value is used.
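
The QP optimality test can be sketched as a check over the variable states. Following the sign convention used for ${gap}_{j}$ under ‘Major Optimality Tolerance’, a nonbasic variable at its lower bound is taken to require $d_{j} / ∥ π ∥ \geq - r$ and one at its upper bound $d_{j} / ∥ π ∥ \leq r$ . The state labels and the function name are hypothetical.

```python
def qp_is_optimal(d, state, pi_norm, r):
    """Apply the reduced-gradient tests to every variable.

    state[j] is one of 'basic', 'lower', 'upper' or 'superbasic'
    (illustrative labels, not the library's istate encoding).
    """
    for dj, st in zip(d, state):
        scaled = dj / pi_norm
        if st == "lower" and scaled < -r:
            return False
        if st == "upper" and scaled > r:
            return False
        if st == "superbasic" and abs(scaled) > r:
            return False
        # Basic variables have zero reduced gradient by construction.
    return True


state = ["basic", "lower", "upper", "superbasic"]
ok = qp_is_optimal([0.0, 0.3, -0.2, 1.0e-9], state, pi_norm=1.0, r=1.0e-6)
bad = qp_is_optimal([0.0, -0.3, -0.2, 1.0e-9], state, pi_norm=1.0, r=1.0e-6)
print(ok, bad)
```

In the second call the variable at its lower bound has a clearly negative reduced gradient, so moving it off the bound would reduce the objective and the subproblem is not yet optimal.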

‘Minor Print Level’int

Default $= 0$

The value of $i$ controls the amount of printout produced by the minor iterations of nlp1_sparse_solve (i.e., the iterations of the quadratic programming algorithm), as indicated below. A detailed description of the printed output is given in Minor Iteration Printout (summary output at each minor iteration) and Monitoring Information (monitoring information at each minor iteration). (See also the description of the option ‘Major Print Level’.)

The following printout is sent to the file object associated with the advisory I/O unit (see FileObjManager):

$i$	Output
$0$	No output.
$\geq 1$	One line of summary output ( $< 80$ characters; see Minor Iteration Printout) for each minor iteration.

The following printout is sent to the unit number given by the option ‘Monitoring File’:

$i$	Output
$0$	No output.
$\geq 1$	One long line of output ( $< 120$ characters; see Monitoring Information) for each minor iteration.

If $‘Major Print Level' \geq 5$ and the unit number defined by the option ‘Monitoring File’ is the advisory unit number, the summary output for each major iteration is suppressed.

‘Monitoring File’int

Default $= - 1$

If $i \geq 0$ and either $‘Major Print Level' \geq 5$ or $‘Minor Print Level' \geq 1$ , then the monitoring information produced by nlp1_sparse_solve at every iteration is sent to a file with logical unit number $i$ . If $i < 0$ , or if both $‘Major Print Level' < 5$ and $‘Minor Print Level' < 1$ , then no monitoring information is produced.
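The condition above can be written as a small predicate. This is only an illustrative sketch: the function name `monitoring_is_produced` and its arguments are hypothetical and are not part of naginterfaces; they simply stand for the values of the three options.

```python
def monitoring_is_produced(monitoring_file, major_print_level, minor_print_level):
    """Return True when monitoring information would be sent to unit `monitoring_file`.

    Hypothetical helper: the arguments mirror the options 'Monitoring File',
    'Major Print Level' and 'Minor Print Level'.
    """
    wants_output = major_print_level >= 5 or minor_print_level >= 1
    return monitoring_file >= 0 and wants_output


print(monitoring_is_produced(-1, 10, 0))  # False: the unit number is negative
print(monitoring_is_produced(9, 10, 0))   # True: the major print level suffices
print(monitoring_is_produced(9, 0, 1))    # True: the minor print level suffices
print(monitoring_is_produced(9, 4, 0))    # False: both print levels are too low
```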

‘Partial Price’int

Default $= 1$ or $10$

The default value of $i$ is $1$ if there are any nonlinear constraints and $10$ otherwise.

This option is recommended for large problems that have significantly more variables than constraints (i.e., $n ≫ m$ ). It reduces the work required for each ‘pricing’ operation (i.e., when a nonbasic variable is selected to become superbasic). The possible choices for $i$ are the following.

$i$	Meaning
$1$	All columns of the constraint matrix $(\begin{matrix} A & - I \end{matrix})$ are searched.
$\geq 2$	Both $A$ and $I$ are partitioned to give $i$ roughly equal segments $A_{j}, I_{j}$ , for $j = 1, 2, \dots, p$ (modulo $p$ ). If the previous pricing search was successful on $A_{j}, I_{j}$ , the next search begins on the segments $A_{j + 1}, I_{j + 1}$ . If a reduced gradient is found that is larger than some dynamic tolerance, the variable with the largest such reduced gradient (of appropriate sign) is selected to enter the basis. If nothing is found, the search continues on the next segments $A_{j + 2}, I_{j + 2}$ and so on.

If $i \leq 0$ , the default value is used.
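The cyclic segment search can be illustrated with a short sketch. It is a simplification under stated assumptions, not the library's implementation: the columns of $A$ and $I$ are treated as one flat list, `scores[j]` is a hypothetical stand-in for a reduced gradient that has already been given the appropriate sign, and `tol` plays the role of the dynamic tolerance.

```python
def partial_price(scores, nseg, start_seg, tol):
    """Search column segments cyclically for a variable to enter the basis.

    scores    -- hypothetical signed reduced gradients, one per column
    nseg      -- number of roughly equal segments (the option value i)
    start_seg -- segment on which the search begins (0-based)
    tol       -- stand-in for the dynamic tolerance
    Returns (column, segment), or (None, None) if no column qualifies.
    """
    n = len(scores)
    # Segment k covers columns bounds[k] .. bounds[k + 1] - 1.
    bounds = [k * n // nseg for k in range(nseg + 1)]
    for offset in range(nseg):
        seg = (start_seg + offset) % nseg
        cols = range(bounds[seg], bounds[seg + 1])
        best = max(cols, key=lambda j: scores[j], default=None)
        if best is not None and scores[best] > tol:
            return best, seg
    return None, None


scores = [0.0, 0.2, 0.0, 0.0, 0.9, 0.1, 0.0, 0.3]
print(partial_price(scores, nseg=4, start_seg=1, tol=0.05))  # (4, 2)
print(partial_price(scores, nseg=1, start_seg=0, tol=0.05))  # (4, 0)
```

With four segments the search starting on segment 1 finds nothing there and moves on to segment 2, so only four columns are examined instead of eight.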

‘Pivot Tolerance’float

Default $= ϵ^{0.67}$

If $r > 0$ , $r$ is used during the solution of QP subproblems to prevent columns entering the basis if they would cause the basis to become almost singular.

When $x$ changes to $x + α p$ for some specified search direction $p$ , a ‘ratio test’ is used to determine which element of $x$ reaches an upper or lower bound first. The corresponding element of $p$ is called the pivot element. Elements of $p$ are ignored (and, therefore, cannot be pivot elements) if they are smaller than $r$ .

It is common in practice for two (or more) variables to reach a bound at essentially the same time. In such cases, the ‘Minor Feasibility Tolerance’ provides some freedom to maximize the pivot element and thereby improve numerical stability. Excessively small values of ‘Minor Feasibility Tolerance’ should, therefore, not be specified. To a lesser extent, the ‘Expand Frequency’ also provides some freedom to maximize the pivot element. Excessively large values of ‘Expand Frequency’ should, therefore, not be specified.

If $r \leq 0$ , the default value is used.

‘Scale Option’int

Default $= 1$ or $2$

The default value of $i$ is $1$ if there are any nonlinear constraints and $2$ otherwise.

This option enables you to scale the variables and constraints using an iterative procedure due to Fourer (1982), which attempts to compute row scales $r_{i}$ and column scales $c_{j}$ such that the scaled matrix coefficients $\bar{a}_{i j} = a_{i j} \times (c_{j} / r_{i})$ are as close as possible to unity. (The lower and upper bounds on the variables and slacks for the scaled problem are redefined as $\bar{l}_{j} = l_{j} / c_{j}$ and $\bar{u}_{j} = u_{j} / c_{j}$ respectively, where $c_{j} \equiv r_{j - n}$ if $j > n$ .) The possible choices for $i$ are the following.

$i$	Meaning
0	No scaling is performed. This is recommended if it is known that the elements of $x$ and the constraint matrix $A$ (along with its Jacobian) never become large (say, $> 1000$ ).
1	All linear constraints and variables are scaled. This may improve the overall efficiency of the function on some problems.
2	All constraints and variables are scaled. Also, an additional scaling is performed that takes into account columns of $(\begin{matrix} A & - I \end{matrix})$ that are fixed or have positive lower bounds or negative upper bounds.

If there are any nonlinear constraints present, the scale factors depend on the Jacobian at the first point that satisfies the linear constraints and the upper and lower bounds. The setting $i = 2$ should, therefore, be used only if a ‘good’ starting point is available and the problem is not highly nonlinear.

If $i < 0$ or $i > 2$ , the default value is used.

‘Scale Tolerance’float

Default $= 0.9$

Note that this option does not apply when $‘Scale Option' = 0$ .

The value $r$ ( $0 < r < 1$ ) is used to control the number of scaling passes to be made through the constraint matrix $A$ . At least $3$ (and at most $10$ ) passes will be made. More precisely, let $a_{p}$ denote the largest column ratio (i.e., $\frac{‘biggest' element}{‘smallest' element}$ in some sense) after the $p$ th scaling pass through $A$ . The scaling procedure is terminated if $a_{p} \geq a_{p - 1} \times r$ for some $p \geq 3$ . Thus, increasing the value of $r$ from $0.9$ to $0.99$ (say) will probably increase the number of passes through $A$ .

If $r \leq 0$ or $r \geq 1$ , the default value is used.
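The termination rule can be traced with a few lines of Python. This is a sketch only: `scaling_passes` is a hypothetical helper, and the list `ratios` (holding $a_{1}, a_{2}, \dots$ ) is made-up data, since the library computes these ratios internally.

```python
def scaling_passes(ratios, r=0.9, min_passes=3, max_passes=10):
    """Return the pass p at which scaling stops.

    ratios[p - 1] is a_p, the largest column ratio after pass p
    (hypothetical input). Scaling stops at the first p >= min_passes
    with a_p >= a_{p-1} * r, or after max_passes passes.
    """
    last = min(max_passes, len(ratios))
    for p in range(min_passes, last + 1):
        if ratios[p - 1] >= ratios[p - 2] * r:
            return p
    return last


ratios = [100.0, 50.0, 30.0, 28.0, 27.9, 27.89]
print(scaling_passes(ratios, r=0.9))   # 4
print(scaling_passes(ratios, r=0.99))  # 5
```

For this data, raising $r$ from $0.9$ to $0.99$ costs one extra pass, in line with the remark above.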

‘Start Objective Check At Column’int

Default $= 1$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by $o b j f u n$ and/or Jacobian elements computed by $c o n f u n$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, then it is reasonable to specify $‘Start Objective Check At Column' = 31$ . Similarly for columns of the Jacobian. If the first $30$ variables occur nonlinearly in the constraints but the remaining variables are nonlinear only in the objective, then $o b j f u n$ must set the first $30$ elements of the array $o b j g r d$ to zero, but these hardly need to be verified. Again it is reasonable to specify $‘Start Objective Check At Column' = 31$ .

If $i_{2} \leq 0$ or $i_{2} > n_{1}^{'}$ , the default value is used.

If $i_{1} \leq 0$ or $i_{1} > m i n (n_{1}^{'}, i_{2})$ , the default value is used.

If $i_{4} \leq 0$ or $i_{4} > n_{1}^{''}$ , the default value is used.

If $i_{3} \leq 0$ or $i_{3} > m i n (n_{1}^{''}, i_{4})$ , the default value is used.

‘Stop Objective Check At Column’int

Default $= n_{1}^{'}$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by $o b j f u n$ and/or Jacobian elements computed by $c o n f u n$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, then it is reasonable to specify $‘Start Objective Check At Column' = 31$ . Similarly for columns of the Jacobian. If the first $30$ variables occur nonlinearly in the constraints but the remaining variables are nonlinear only in the objective, then $o b j f u n$ must set the first $30$ elements of the array $o b j g r d$ to zero, but these hardly need to be verified. Again it is reasonable to specify $‘Start Objective Check At Column' = 31$ .

If $i_{2} \leq 0$ or $i_{2} > n_{1}^{'}$ , the default value is used.

If $i_{1} \leq 0$ or $i_{1} > m i n (n_{1}^{'}, i_{2})$ , the default value is used.

If $i_{4} \leq 0$ or $i_{4} > n_{1}^{''}$ , the default value is used.

If $i_{3} \leq 0$ or $i_{3} > m i n (n_{1}^{''}, i_{4})$ , the default value is used.

‘Start Constraint Check At Column’int

Default $= 1$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by $o b j f u n$ and/or Jacobian elements computed by $c o n f u n$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, then it is reasonable to specify $‘Start Objective Check At Column' = 31$ . Similarly for columns of the Jacobian. If the first $30$ variables occur nonlinearly in the constraints but the remaining variables are nonlinear only in the objective, then $o b j f u n$ must set the first $30$ elements of the array $o b j g r d$ to zero, but these hardly need to be verified. Again it is reasonable to specify $‘Start Objective Check At Column' = 31$ .

If $i_{2} \leq 0$ or $i_{2} > n_{1}^{'}$ , the default value is used.

If $i_{1} \leq 0$ or $i_{1} > m i n (n_{1}^{'}, i_{2})$ , the default value is used.

If $i_{4} \leq 0$ or $i_{4} > n_{1}^{''}$ , the default value is used.

If $i_{3} \leq 0$ or $i_{3} > m i n (n_{1}^{''}, i_{4})$ , the default value is used.

‘Stop Constraint Check At Column’int

Default $= n_{1}^{''}$

These keywords take effect only if $‘Verify Level' > 0$ . They may be used to control the verification of gradient elements computed by $o b j f u n$ and/or Jacobian elements computed by $c o n f u n$ . For example, if the first $30$ elements of the objective gradient appeared to be correct in an earlier run, so that only element $31$ remains questionable, then it is reasonable to specify $‘Start Objective Check At Column' = 31$ . Similarly for columns of the Jacobian. If the first $30$ variables occur nonlinearly in the constraints but the remaining variables are nonlinear only in the objective, then $o b j f u n$ must set the first $30$ elements of the array $o b j g r d$ to zero, but these hardly need to be verified. Again it is reasonable to specify $‘Start Objective Check At Column' = 31$ .

If $i_{2} \leq 0$ or $i_{2} > n_{1}^{'}$ , the default value is used.

If $i_{1} \leq 0$ or $i_{1} > m i n (n_{1}^{'}, i_{2})$ , the default value is used.

If $i_{4} \leq 0$ or $i_{4} > n_{1}^{''}$ , the default value is used.

If $i_{3} \leq 0$ or $i_{3} > m i n (n_{1}^{''}, i_{4})$ , the default value is used.

‘Superbasics Limit’int

Default $= m i n (500, \bar{n} + 1)$

Note that this option does not apply to linear problems.

It places a limit on the storage allocated for superbasic variables. Ideally, the value of $i$ should be set slightly larger than the ‘number of degrees of freedom’ expected at the solution.

For nonlinear problems, the number of degrees of freedom is often called the ‘number of independent variables’. Normally, the value of $i$ need not be greater than $\bar{n} + 1$ , but for many problems it may be considerably smaller. (This will save storage if $\bar{n}$ is very large.)

If $i \leq 0$ , the default value is used.

‘Unbounded Objective’float

Default $= 10^{15}$

These options are intended to detect unboundedness in nonlinear problems. During the linesearch, the objective function $f$ is evaluated at points of the form $x + α p$ , where $x$ and $p$ are fixed and $α$ varies. If $| f |$ exceeds $r_{1}$ or $α$ exceeds $r_{2}$ , the iterations are terminated and the function returns with $e r r n o$ = 3.

If singularities are present, unboundedness in $f (x)$ may manifest itself by a floating-point overflow during the evaluation of $f (x + α p)$ , before the test against $r_{1}$ can be made.

Unboundedness in $x$ is best avoided by placing finite upper and lower bounds on the variables.

If $r_{1} \leq 0$ or $r_{2} \leq 0$ , the appropriate default value is used.

‘Unbounded Step Size’float

Default $= m a x (bigbnd, 10^{20})$

These options are intended to detect unboundedness in nonlinear problems. During the linesearch, the objective function $f$ is evaluated at points of the form $x + α p$ , where $x$ and $p$ are fixed and $α$ varies. If $| f |$ exceeds $r_{1}$ or $α$ exceeds $r_{2}$ , the iterations are terminated and the function returns with $e r r n o$ = 3.

If singularities are present, unboundedness in $f (x)$ may manifest itself by a floating-point overflow during the evaluation of $f (x + α p)$ , before the test against $r_{1}$ can be made.

Unboundedness in $x$ is best avoided by placing finite upper and lower bounds on the variables.

If $r_{1} \leq 0$ or $r_{2} \leq 0$ , the appropriate default value is used.

‘Verify Level’int

Default $= 0$

This option refers to finite difference checks on the gradient elements computed by $o b j f u n$ and $c o n f u n$ . Gradients are verified at the first point that satisfies the linear constraints and the upper and lower bounds. Unspecified gradient elements are not checked and hence they result in no overhead. The possible choices for $i$ are the following.

$i$	Meaning
$- 1$	No checks are performed.
$0$	Only a ‘cheap’ test will be performed, requiring three calls to $o b j f u n$ and two calls to $c o n f u n$ . Note that no checks are carried out if every column of the constraint gradients (Jacobian) contains a missing element.
$1$	Individual objective gradient elements will be checked using a reliable (but more expensive) test. If $‘Major Print Level' > 0$ , a key of the form OK or BAD? indicates whether or not each element appears to be correct. If a gradient element is determined to be extremely poor (i.e., if it appears to have no significant digits of accuracy at all), then nlp1_sparse_solve will also exit with an error indicator in argument $errno$ .
$2$	Individual columns of the constraint gradients (Jacobian) will be checked using a reliable (but more expensive) test. If $‘Major Print Level' > 0$ , a key of the form OK or BAD? indicates whether or not each element appears to be correct.
$3$	Check both constraint and objective gradients (in that order) as described above for $i = 2$ and $i = 1$ respectively.

The value $i = 3$ should be used whenever a new function is being developed. The ‘Start Objective Check At Column’ and ‘Stop Objective Check At Column’ keywords may be used to limit the number of nonlinear variables to be checked.

If $i < - 1$ or $i > 3$ , the default value is used.
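The idea behind an element-wise gradient check can be sketched as follows. This is a generic central-difference comparison written for illustration, not the test that nlp1_sparse_solve performs; the helper `check_gradient`, its step `h` and its tolerance `tol` are assumptions, and only the OK / BAD? keys and the 1-based start/stop columns are borrowed from the text above.

```python
def check_gradient(f, grad, x, start=1, stop=None, h=1e-6, tol=1e-4):
    """Compare supplied gradient elements with central differences.

    start and stop are 1-based column indices, in the spirit of the
    'Start/Stop Objective Check At Column' options.
    Returns a dict mapping column number to 'OK' or 'BAD?'.
    """
    stop = len(x) if stop is None else stop
    g = grad(x)
    report = {}
    for j in range(start - 1, stop):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        fd = (f(xp) - f(xm)) / (2.0 * h)
        ok = abs(fd - g[j]) <= tol * (1.0 + abs(fd))
        report[j + 1] = 'OK' if ok else 'BAD?'
    return report


def f(x):
    # Nonlinear objective part (u + v + z)**2 from the example in the Notes.
    return sum(x) ** 2


def grad_with_bug(x):
    # Deliberately wrong in the third element, to show a 'BAD?' key.
    t = 2.0 * sum(x)
    return [t, t, 0.0]


print(check_gradient(f, grad_with_bug, [1.0, 2.0, 3.0]))
print(check_gradient(f, grad_with_bug, [1.0, 2.0, 3.0], start=3))
```

Restricting the check to column 3 skips the elements already believed to be correct, which is the purpose of the start and stop keywords.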

‘Violation Limit’float

Default $= 10.0$

This option defines an absolute limit on the magnitude of the maximum constraint violation after the linesearch. Upon completion of the linesearch, the new iterate $x_{k + 1}$ satisfies the condition

v_{i} (x_{k + 1}) \leq r \times m a x (1, v_{i} (x_{0})),

where $x_{0}$ is the point at which the nonlinear constraints are first evaluated and $v_{i} (x)$ is the $i$ th nonlinear constraint violation $v_{i} (x) = m a x (0, l_{i} - F_{i} (x), F_{i} (x) - u_{i})$ .

The effect of the violation limit is to restrict the iterates to lie in an expanded feasible region whose size depends on the magnitude of $r$ . This makes it possible to keep the iterates within a region where the objective function is expected to be well-defined and bounded below (or above in the case of maximization). If the objective function is bounded below (or above in the case of maximization) for all values of the variables, then $r$ may be any large positive value.

If $r \leq 0$ , the default value is used.
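A minimal sketch of the violation test, assuming the constraint values $F_{i} (x)$ are already available as plain lists. The helper names `violations` and `within_violation_limit` are hypothetical and the numbers are illustrative only.

```python
def violations(F, lower, upper):
    """Return v_i = max(0, l_i - F_i, F_i - u_i) for each nonlinear constraint."""
    return [max(0.0, l - f, f - u) for f, l, u in zip(F, lower, upper)]


def within_violation_limit(F_new, F_start, lower, upper, r=10.0):
    """Check v_i(x_{k+1}) <= r * max(1, v_i(x_0)) for every i."""
    v_new = violations(F_new, lower, upper)
    v_start = violations(F_start, lower, upper)
    return all(vn <= r * max(1.0, v0) for vn, v0 in zip(v_new, v_start))


# Two equality constraints F_1 = 2 and F_2 = 4, first evaluated where F = (0, 0),
# so v(x_0) = (2, 4) and the limits are 20 and 40 with the default r = 10.
lower = [2.0, 4.0]
upper = [2.0, 4.0]
print(within_violation_limit([15.0, 30.0], [0.0, 0.0], lower, upper))  # True
print(within_violation_limit([15.0, 50.0], [0.0, 0.0], lower, upper))  # False
```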

Raises

NagValueError

(errno $7$ )

On entry, $s t a r t = ⟨ v a l u e ⟩$ .

Constraint: $s t a r t ='C'$ or $'W'$ .

(errno $7$ )

On entry, $n = ⟨ v a l u e ⟩$ .

Constraint: $n \geq 1$ .

(errno $7$ )

On entry, $m = ⟨ v a l u e ⟩$ .

Constraint: $m \geq 1$ .

(errno $7$ )

On entry, $nnz = ⟨ v a l u e ⟩$ , $n = ⟨ v a l u e ⟩$ and $m = ⟨ v a l u e ⟩$ .

Constraint: $1 \leq nnz \leq n \times m$ .

(errno $7$ )

On entry, $n c n l n = ⟨ v a l u e ⟩$ and $m = ⟨ v a l u e ⟩$ .

Constraint: $0 \leq n c n l n \leq m$ .

(errno $7$ )

On entry, $n o n l n = ⟨ v a l u e ⟩$ and $n = ⟨ v a l u e ⟩$ .

Constraint: $0 \leq n o n l n \leq n$ .

(errno $7$ )

On entry, $n c n l n = 0$ and $n j n l n = ⟨ v a l u e ⟩$ .

Constraint: if $n c n l n = 0$ then $n j n l n = 0$ .

(errno $7$ )

On entry, $n j n l n = ⟨ v a l u e ⟩$ , $n = ⟨ v a l u e ⟩$ and $n c n l n = ⟨ v a l u e ⟩$ .

Constraint: if $n c n l n > 0$ then $1 \leq n j n l n \leq n$ .

(errno $7$ )

On entry, $i o b j = ⟨ v a l u e ⟩$ .

Constraint: $i o b j \geq - 1$ .

(errno $7$ )

On entry, $i o b j = ⟨ v a l u e ⟩$ , $n c n l n = ⟨ v a l u e ⟩$ and $m = ⟨ v a l u e ⟩$ .

Constraint: $n c n l n < i o b j \leq m$ .

(errno $7$ )

On entry, $nname = ⟨ v a l u e ⟩$ , $n = ⟨ v a l u e ⟩$ and $m = ⟨ v a l u e ⟩$ .

Constraint: $nname = 1$ or $nname = n + m$ .

(errno $7$ )

On entry, $l e n i z = ⟨ v a l u e ⟩$ , $n = ⟨ v a l u e ⟩$ and $m = ⟨ v a l u e ⟩$ .

Constraint: $l e n i z \geq m a x (500, m + n)$ .

(errno $7$ )

On entry, $l e n z = ⟨ v a l u e ⟩$ .

Constraint: $l e n z \geq 500$ .

(errno $7$ )

On entry, $m = ⟨ v a l u e ⟩$ and $h a [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

Constraint: $1 \leq h a [j] \leq m$ for all $j$ .

(errno $7$ )

On entry, $k a [0] = ⟨ v a l u e ⟩$ .

Constraint: $k a [0] = 1$ .

(errno $7$ )

On entry, $n + 1 = ⟨ v a l u e ⟩$ , $k a [n] = ⟨ v a l u e ⟩$ and $nnz + 1 = ⟨ v a l u e ⟩$ .

Constraint: $k a [n] = nnz + 1$ .

(errno $7$ )

On entry, $k a [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

Constraint: $k a [j] \geq 1$ for all $j$ .

(errno $7$ )

On entry, $m = ⟨ v a l u e ⟩$ , $k a [j] = ⟨ v a l u e ⟩$ and $k a [j - 1] = ⟨ v a l u e ⟩$ , for $j = ⟨ v a l u e ⟩$ .

Constraint: $0 \leq k a [j + 1] - k a [j] \leq m$ for all $j$ .

(errno $7$ )

On entry, duplicate element found in row $⟨ v a l u e ⟩$ , column $⟨ v a l u e ⟩$ .

(errno $7$ )

On entry, $s t a r t ='C'$ and $i s t a t e [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

Constraint: if $s t a r t ='C'$ then $i s t a t e [i] \geq 0$ for all $i$ .

(errno $7$ )

On entry, $s t a r t ='C'$ and $i s t a t e [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

Constraint: if $s t a r t ='C'$ then $i s t a t e [i] \leq 5$ for all $i$ .

(errno $7$ )

On entry, $s t a r t ='W'$ and $i s t a t e [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

Constraint: if $s t a r t ='W'$ then $i s t a t e [i] \geq 0$ for all $i$ .

(errno $7$ )

On entry, $s t a r t ='W'$ and $i s t a t e [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

Constraint: if $s t a r t ='W'$ then $i s t a t e [i] \leq 3$ for all $i$ .

(errno $7$ )

On entry, $n = ⟨ v a l u e ⟩$ , $i o b j = ⟨ v a l u e ⟩$ , $b l [n + | i o b j | - 1] = ⟨ v a l u e ⟩$ and $bigbnd = ⟨ v a l u e ⟩$ .

Constraint: $b l [n + | i o b j | - 1] \leq - bigbnd$ .

(errno $7$ )

On entry, $n = ⟨ v a l u e ⟩$ , $i o b j = ⟨ v a l u e ⟩$ , $b u [n + | i o b j | - 1] = ⟨ v a l u e ⟩$ and $bigbnd = ⟨ v a l u e ⟩$ .

Constraint: $b u [n + | i o b j | - 1] \geq bigbnd$ .

(errno $7$ )

On entry, the equal bounds on $⟨ v a l u e ⟩$ are infinite, because $b l [⟨ v a l u e ⟩] = beta$ and $b u [⟨ v a l u e ⟩] = beta$ , but $| beta | \geq bigbnd$ : $beta = ⟨ v a l u e ⟩$ and $bigbnd = ⟨ v a l u e ⟩$ .

(errno $7$ )

On entry, the bounds on $⟨ v a l u e ⟩$ are inconsistent: $b l [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ and $b u [⟨ v a l u e ⟩] = ⟨ v a l u e ⟩$ .

(errno $8$ )

Function $o b j f u n$ appears to be giving incorrect gradients.

(errno $9$ )

Function $c o n f u n$ appears to be giving incorrect gradients.

(errno $11$ )

Numerical error in trying to satisfy the linear constraints.

(errno $12$ )

Not enough integer workspace for the basis factors.

(errno $13$ )

Not enough real workspace for the basis factors.

Warns

NagAlgorithmicWarning

(errno $- 1$ ): Constraint and objective values could not be calculated.
(errno $i < 0$ ): User requested termination by setting $m o d e$ negative in $o b j f u n$ or $c o n f u n$ .
(errno $1$ ): Infeasible problem, nonlinear infeasibilities minimized.
(errno $1$ ): No feasible point for the nonlinear constraints.
(errno $1$ ): No feasible point for the linear constraints.
(errno $2$ ): The problem is unbounded (or badly scaled).
(errno $3$ ): Violation Limit exceeded. The problem may be unbounded.
(errno $5$ ): Feasible solution, but requested accuracy could not be achieved.
(errno $10$ ): Current point cannot be improved upon.
(errno $14$ ): The basis is singular after $15$ factorization attempts.
(errno $15$ ): Not enough integer workspace to start solving the problem.
(errno $16$ ): Not enough real workspace to start solving the problem.

NagAlgorithmicMajorWarning

(errno $4$ ): Major Iteration Limit exceeded.
(errno $4$ ): Minor Iteration Limit exceeded.
(errno $4$ ): Iteration Limit exceeded.
(errno $6$ ): The value of the option ‘Superbasics Limit’ is too small.

Notes

In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

nlp1_sparse_solve is designed to solve a class of nonlinear programming problems that are assumed to be stated in the following general form:

\underset{x \in R^{n}}{\mathrm{minimize}} \ f (x) \quad \text{subject to} \quad l \leq \begin{Bmatrix} x \\ F (x) \\ G x \end{Bmatrix} \leq u, \qquad (1)

where $x = {(x_{1}, x_{2}, \dots, x_{n})}^{T}$ is a set of variables, $f (x)$ is a smooth scalar objective function, $l$ and $u$ are constant lower and upper bounds, $F (x)$ is a vector of smooth nonlinear constraint functions $\{F_{i} (x)\}$ and $G$ is a sparse matrix.

The constraints involving $F$ and $G x$ are called the general constraints. Note that upper and lower bounds are specified for all variables and constraints. This form allows full generality in specifying various types of constraint. In particular, the $j$ th constraint can be defined as an equality by setting $l_{j} = u_{j}$ . If certain bounds are not present, the associated elements of $l$ or $u$ can be set to special values that will be treated as $- \infty$ or $+ \infty$ . (See the description of the option ‘Infinite Bound Size’.)

nlp1_sparse_solve converts the upper and lower bounds on the $m$ elements of $F$ and $G x$ to equalities by introducing a set of slack variables $s$ , where $s = {(s_{1}, s_{2}, \dots, s_{m})}^{T}$ . For example, the linear constraint $5 \leq 2 x_{1} + 3 x_{2} \leq + \infty$ is replaced by $2 x_{1} + 3 x_{2} - s_{1} = 0$ , together with the bounded slack $5 \leq s_{1} \leq + \infty$ . The problem defined by (1) can, therefore, be re-written in the following equivalent form:

\underset{x \in R^{n}, s \in R^{m}}{\mathrm{minimize}} \ f (x) \quad \text{subject to} \quad \begin{pmatrix} F (x) \\ G x \end{pmatrix} - s = 0, \quad l \leq \begin{Bmatrix} x \\ s \end{Bmatrix} \leq u .
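The slack conversion for the linear constraint $5 \leq 2 x_{1} + 3 x_{2} \leq + \infty$ can be checked numerically. This is a minimal sketch: the helper names are hypothetical, and `float('inf')` stands in for $+ \infty$ , whereas the library treats bounds beyond the ‘Infinite Bound Size’ as infinite.

```python
def slack_values(G, x):
    """Return s = G x, so that G x - s = 0 holds row by row."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in G]


def bounds_satisfied(values, lower, upper):
    """Check l <= value <= u element by element."""
    return all(l <= v <= u for v, l, u in zip(values, lower, upper))


INF = float('inf')        # stand-in for an infinite bound (assumption)
G = [[2.0, 3.0]]          # the single row 2*x1 + 3*x2
x = [1.0, 2.0]

s = slack_values(G, x)    # s1 = 2*1 + 3*2 = 8
print(s, bounds_satisfied(s, [5.0], [INF]))  # [8.0] True
```

The bound that originally applied to the constraint row now applies to $s_{1}$ , which is why the bounds can be thought of as bounds on the combined vector $(x, s)$ .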

Since the slack variables $s$ are subject to the same upper and lower bounds as the elements of $F$ and $G x$ , the bounds on $F$ and $G x$ can simply be thought of as bounds on the combined vector $(x, s)$ . The elements of $x$ and $s$ are partitioned into basic, nonbasic and superbasic variables defined as follows:

a basic variable ( $x_{j}$ say) is the $j$ th variable associated with the $j$ th column of the basis matrix $B$ ;
a nonbasic variable is a variable that is temporarily fixed at its current value (usually its upper or lower bound);
a superbasic variable is a nonbasic variable that is not at one of its bounds and is free to move in any desired direction (namely one that will improve the value of the objective function or reduce the sum of infeasibilities).

For example, in the simplex method (see Gill et al. (1981)) the elements of $x$ can be partitioned at each vertex into a set of $m$ basic variables (all non-negative) and a set of $(n - m)$ nonbasic variables (all zero). This is equivalent to partitioning the columns of the constraint matrix as $(\begin{matrix} B & N \end{matrix})$ , where $B$ contains the $m$ columns that correspond to the basic variables and $N$ contains the $(n - m)$ columns that correspond to the nonbasic variables. Note that $B$ is square and nonsingular.

The option ‘Maximize’ may be used to specify an alternative problem in which $f (x)$ is maximized. If the objective function is nonlinear and all the constraints are linear, $F$ is absent and the problem is said to be linearly constrained. In general, the objective and constraint functions are structured in the sense that they are formed from sums of linear and nonlinear functions. This structure can be exploited by the function during the solution process as follows.

Consider the following nonlinear optimization problem with four variables ( $u, v, z, w$ ):

\underset{u, v, z, w}{\mathrm{minimize}} \ {(u + v + z)}^{2} + 3 z + 5 w

subject to the constraints

\begin{matrix} u^{2} + v^{2} + z = 2 \\ u^{4} + v^{4} + w = 4 \\ 2 u + 4 v \geq 0 \end{matrix}

and to the bounds

\begin{matrix} z \geq 0 \\ w \geq 0 . \end{matrix}

This problem has several characteristics that can be exploited by the function:

the objective function is nonlinear. It is the sum of a nonlinear function of the variables ( $u, v, z$ ) and a linear function of the variables ( $z, w$ );
the first two constraints are nonlinear. The third is linear;
each nonlinear constraint function is the sum of a nonlinear function of the variables ( $u, v$ ) and a linear function of the variables ( $z, w$ ).

The nonlinear terms are defined by $o b j f u n$ and $c o n f u n$ (see Parameters), which involve only the appropriate subset of variables.

For the objective, we define the function $f (u, v, z) = {(u + v + z)}^{2}$ to include only the nonlinear part of the objective. The three variables ( $u, v, z$ ) associated with this function are known as the nonlinear objective variables. The number of them is given by $n o n l n$ (see Parameters) and they are the only variables needed in $o b j f u n$ . The linear part $3 z + 5 w$ of the objective is stored in row $i o b j$ (see Parameters) of the (constraint) Jacobian matrix $A$ (see below).

Thus, if $x^{'}$ and $y^{'}$ denote the nonlinear and linear objective variables, respectively, the objective may be re-written in the form

f (x^{'}) + c^{T} x^{'} + d^{T} y^{'},

where $f (x^{'})$ is the nonlinear part of the objective and $c$ and $d$ are constant vectors that form a row of $A$ . In this example, $x^{'} = (u, v, z)$ and $y^{'} = w$ .
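The split of the example objective into $f (x^{'}) + c^{T} x^{'} + d^{T} y^{'}$ can be written out directly. The sketch below uses plain Python functions with hypothetical names; it does not follow the calling convention required of $o b j f u n$ (see Parameters) and is meant only to show which terms belong to which part.

```python
def f_nonlinear(xp):
    """Nonlinear part f(x') = (u + v + z)**2, with x' = (u, v, z)."""
    u, v, z = xp
    return (u + v + z) ** 2


def f_nonlinear_grad(xp):
    """Gradient of the nonlinear part with respect to (u, v, z)."""
    t = 2.0 * sum(xp)
    return [t, t, t]


c = [0.0, 0.0, 3.0]   # linear coefficients of x' = (u, v, z): the term 3z
d = [5.0]             # linear coefficient of y' = (w,): the term 5w


def objective(xp, yp):
    """Full objective f(x') + c^T x' + d^T y'."""
    linear = sum(ci * xi for ci, xi in zip(c, xp))
    linear += sum(di * yi for di, yi in zip(d, yp))
    return f_nonlinear(xp) + linear


# (1 + 2 + 3)**2 + 3*3 + 5*4 = 36 + 9 + 20
print(objective([1.0, 2.0, 3.0], [4.0]))  # 65.0
print(f_nonlinear_grad([1.0, 2.0, 3.0]))  # [12.0, 12.0, 12.0]
```

Only `f_nonlinear` and its gradient correspond to what a user-supplied objective function would compute; the vectors `c` and `d` are the entries of the free row of $A$ .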

Similarly for the constraints, we define a vector function $F (u, v)$ to include just the nonlinear terms. In this example, $F_{1} (u, v) = u^{2} + v^{2}$ and $F_{2} (u, v) = u^{4} + v^{4}$ , where the two variables ( $u, v$ ) are known as the nonlinear Jacobian variables. The number of them is given by $n j n l n$ (see Parameters) and they are the only variables needed in $c o n f u n$ . Thus, if $x^{''}$ and $y^{''}$ denote the nonlinear and linear Jacobian variables, respectively, the constraint functions and the linear part of the objective have the form

\begin{pmatrix} F (x^{''}) + A_{2} y^{''} \\ A_{3} x^{''} + A_{4} y^{''} \end{pmatrix}, \qquad (3)

where $x^{''} = (u, v)$ and $y^{''} = (z, w)$ in this example. This ensures that the Jacobian is of the form

A = \begin{pmatrix} J (x^{''}) & A_{2} \\ A_{3} & A_{4} \end{pmatrix},

where $J (x^{''}) = \frac{\partial F (x^{''})}{\partial x}$ . Note that $J (x^{''})$ always appears in the top left-hand corner of $A$ .

The inequalities $l_{1} \leq F (x^{''}) + A_{2} y^{''} \leq u_{1}$ and $l_{2} \leq A_{3} x^{''} + A_{4} y^{''} \leq u_{2}$ implied by the constraint functions in (3) are known as the nonlinear and linear constraints, respectively. The nonlinear constraint vector $F (x^{''})$ in (3) and (optionally) its partial derivative matrix $J (x^{''})$ are set in $c o n f u n$ . The matrices $A_{2}$ , $A_{3}$ and $A_{4}$ contain any (constant) linear terms. Along with the sparsity pattern of $J (x^{''})$ they are stored in the arrays $a$ , $h a$ and $k a$ (see Parameters).
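For the four-variable example, the column-wise storage of $A$ might be assembled as in the sketch below. Several points are assumptions made for illustration: the helper `to_column_storage` is hypothetical, the nonlinear Jacobian elements are stored as `0.0` placeholders, and the linear objective row is placed last (so $i o b j = 4$ ). The 1-based indexing follows the constraints $k a [0] = 1$ , $k a [n] = nnz + 1$ and $1 \leq h a [j] \leq m$ listed under Raises.

```python
def to_column_storage(rows):
    """Convert dense rows to column-wise (a, ha, ka) with 1-based indices.

    Entries equal to None are structural zeros and are skipped. Every other
    entry is stored, including 0.0 placeholders for Jacobian elements.
    """
    m, n = len(rows), len(rows[0])
    a, ha, ka = [], [], [1]
    for j in range(n):
        for i in range(m):
            if rows[i][j] is not None:
                a.append(rows[i][j])
                ha.append(i + 1)       # 1-based row index
        ka.append(len(a) + 1)          # 1-based start of the next column
    return a, ha, ka


J = 0.0    # placeholder for a nonlinear Jacobian element (assumption)
Z = None   # structural zero

# Columns are ordered (u, v, z, w); nonlinear constraints come first.
A = [
    [J,   J,   1.0, Z],     # u^2 + v^2 + z
    [J,   J,   Z,   1.0],   # u^4 + v^4 + w
    [2.0, 4.0, Z,   Z],     # 2u + 4v
    [Z,   Z,   3.0, 5.0],   # free row: linear objective 3z + 5w
]

a, ha, ka = to_column_storage(A)
print(a)    # [0.0, 0.0, 2.0, 0.0, 0.0, 4.0, 1.0, 3.0, 1.0, 5.0]
print(ha)   # [1, 2, 3, 1, 2, 3, 1, 4, 2, 4]
print(ka)   # [1, 4, 7, 9, 11]
```

Here $nnz = 10$ , and the $2 \times 2$ block of placeholders in the top left-hand corner marks the sparsity pattern of $J (x^{''})$ .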

In general, the vectors $x^{'}$ and $x^{''}$ have different dimensions, but they always overlap, in the sense that the shorter vector is always the beginning of the other. In the above example, the nonlinear Jacobian variables $(u, v)$ are an ordered subset of the nonlinear objective variables $(u, v, z)$ . In other cases it could be the other way round (whichever is the most convenient), but the first way keeps $J (x^{''})$ as small as possible.

Note that the nonlinear objective function $f (x^{'})$ may involve either a subset or superset of the variables appearing in the nonlinear constraint functions $F (x^{''})$ . Thus, $n o n l n \leq n j n l n$ (or vice-versa). Sometimes the objective and constraints really involve disjoint sets of nonlinear variables. In such cases the variables should be ordered so that $n o n l n > n j n l n$ and $x^{'} = (x^{''}, x^{'''})$ , where the objective is nonlinear in just the last vector $x^{'''}$ . The first $n j n l n$ elements of the gradient array $o b j g r d$ should also be set to zero in $o b j f u n$ .

If all elements of the constraint Jacobian are known (i.e., the option $‘Derivative Level' = 2$ or $3$ ), any constant elements may be assigned their correct values in $a$ , $h a$ and $k a$ . The corresponding elements of the constraint Jacobian array $f j a c$ need not be reset in $c o n f u n$ . This includes values that are identically zero as constraint Jacobian elements are assumed to be zero unless specified otherwise. It must be emphasized that, if $‘Derivative Level' = 0$ or $1$ , unassigned elements of $f j a c$ are not treated as constant; they are estimated by finite differences, at nontrivial expense.

If there are no nonlinear constraints in (1) and $f (x)$ is linear or quadratic, then it may be more efficient to use qpconvex2_sparse_solve() to solve the resulting linear or quadratic programming problem, or one of lp_solve(), lsq_lincon_solve() or qp_dense_solve() if $G$ is a dense matrix. If the problem is dense and does have nonlinear constraints then one of nlp2_solve(), nlp1_rcomm() or lsq_gencon_deriv() (as appropriate) should be used instead.

You must supply an initial estimate of the solution to (1), together with versions of $o b j f u n$ and $c o n f u n$ that define $f (x^{'})$ and $F (x^{''})$ , respectively, and as many first partial derivatives as possible. Note that if there are any nonlinear constraints, then the first call to $c o n f u n$ will precede the first call to $o b j f u n$ .

nlp1_sparse_solve is based on the SNOPT package described in Gill et al. (2002), which in turn utilizes functions from the MINOS package (see Murtagh and Saunders (1995)). It incorporates a Sequential Quadratic Programming (SQP) method that obtains search directions from a sequence of Quadratic Programming (QP) subproblems. Each QP subproblem minimizes a quadratic model of a certain Lagrangian function subject to a linearization of the constraints. An augmented Lagrangian merit function is reduced along each search direction to ensure convergence from any starting point. Further details can be found in Algorithmic Details.

Throughout this document the symbol $ϵ$ is used to represent the machine precision (see machine.precision).

References

Conn, A R, 1973, Constrained optimization using a nondifferentiable penalty function, SIAM J. Numer. Anal. (10), 760–779

Eldersveld, S K, 1991, Large-scale sequential quadratic programming algorithms, PhD Thesis, Department of Operations Research, Stanford University, Stanford

Fletcher, R, 1984, An $l_{1}$ penalty method for nonlinear constraints, Numerical Optimization 1984, (eds P T Boggs, R H Byrd and R B Schnabel), 26–40, SIAM Philadelphia

Fourer, R, 1982, Solving staircase linear programs by the simplex method, Math. Programming (23), 274–313

Gill, P E, Murray, W and Saunders, M A, 2002, SNOPT: An SQP Algorithm for Large-scale Constrained Optimization, SIAM J. Optim. (12), 979–1006

Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1986, Users’ guide for NPSOL (Version 4.0): a Fortran package for nonlinear programming, Report SOL 86-2, Department of Operations Research, Stanford University

Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1989, A practical anti-cycling procedure for linearly constrained optimization, Math. Programming (45), 437–474

Gill, P E, Murray, W, Saunders, M A and Wright, M H, 1992, Some theoretical properties of an augmented Lagrangian merit function, Advances in Optimization and Parallel Computing, (ed P M Pardalos), 101–128, North Holland

Gill, P E, Murray, W and Wright, M H, 1981, Practical Optimization, Academic Press

Hock, W and Schittkowski, K, 1981, Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems (187), Springer–Verlag

Murtagh, B A and Saunders, M A, 1995, MINOS 5.4 users’ guide, Report SOL 83-20R, Department of Operations Research, Stanford University

Ortega, J M and Rheinboldt, W C, 1970, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press

Powell, M J D, 1974, Introduction to constrained optimization, Numerical Methods for Constrained Optimization, (eds P E Gill and W Murray), 1–28, Academic Press
