e04gn: FL CL CPP AD PY MB

NAG FL Interface
e04gnf (handle_solve_nldf)

Note: this routine uses optional parameters to define choices in the problem specification and in the details of the algorithm. If you wish to use default settings for all of the optional parameters, you need only read Sections 1 to 10 of this document. If, however, you wish to reset some or all of the settings please refer to Section 11 for a detailed description of the algorithm and to Section 12 for a detailed description of the specification of the optional parameters.


NAG Library Manual, Mark 28.6

Interfaces: FL CL CPP AD PY MB

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

9.1 Description of the Printed Output

10 Example

10.1 Program Text

10.2 Program Data

10.3 Program Results

11 Algorithmic Details

11.1 Loss Function and Regularization Types

11.2 Optimization Problem Transformations

12 Optional Parameters

12.1 Description of the Optional Parameters

1 Purpose

e04gnf is a solver from the NAG optimization modelling suite for general nonlinear data-fitting problems with constraints. Various loss and regularization functions are supported.

2 Specification

Fortran Interface

Subroutine e04gnf (

handle, lsqfun, lsqgrd, confun, congrd, monit, nvar, x, nres, rx, rinfo, stats, iuser, ruser, cpuser, ifail)

Integer, Intent (In)	::	nvar, nres
Integer, Intent (Inout)	::	iuser(*), ifail
Real (Kind=nag_wp), Intent (Inout)	::	x(nvar), ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	rx(nres), rinfo(100), stats(100)
Type (c_ptr), Intent (In)	::	handle, cpuser
External	::	lsqfun, lsqgrd, confun, congrd, monit

C Header Interface

#include <nag.h>

void

e04gnf_ (void **handle,
void (NAG_CALL *lsqfun)(const Integer *nvar, const double x[], const Integer *nres, double rx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser),
void (NAG_CALL *lsqgrd)(const Integer *nvar, const double x[], const Integer *nres, const Integer *nnzrd, double rdx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser),
void (NAG_CALL *confun)(const Integer *nvar, const double x[], const Integer *ncnln, double gx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser),
void (NAG_CALL *congrd)(const Integer *nvar, const double x[], const Integer *nnzgd, double gdx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser),
void (NAG_CALL *monit)(const Integer *nvar, const double x[], Integer *inform, const double rinfo[], const double stats[], Integer iuser[], double ruser[], void **cpuser),
const Integer *nvar, double x[], const Integer *nres, double rx[], double rinfo[], double stats[], Integer iuser[], double ruser[], void **cpuser, Integer *ifail)

The routine may be called by the names e04gnf or nagf_opt_handle_solve_nldf.

3 Description

e04gnf solves a data-fitting problem of the form

$\begin{array}{ll} \underset{x \in ℝ^{n_{var}}}{\text{minimize}} & f (x) = \sum_{i = 1}^{n_{res}} χ (r_{i} (x)) + ρ \sum_{i = 1}^{n_{var}} ψ (x_{i}) \\ \text{subject to} & l_{g} \leq g (x) \leq u_{g}, \\ & \frac{1}{2} x^{T} Q_{i} x + p_{i}^{T} x + s_{i} \leq 0, \quad 1 \leq i \leq m_{Q}, \\ & l_{B} \leq B x \leq u_{B}, \\ & l_{x} \leq x \leq u_{x}, \end{array} \qquad (1)$

where

$n_{var}$ is the number of decision variables,
$m_{g}$ is the number of nonlinear constraints and $g (x)$ , $l_{g}$ and $u_{g}$ are $m_{g}$ -dimensional vectors,
$m_{Q}$ is the number of quadratic constraints,
$m_{B}$ is the number of linear constraints, $B$ is an $m_{B} \times n_{var}$ matrix, and $l_{B}$ and $u_{B}$ are $m_{B}$ -dimensional vectors,
there are $n_{var}$ box constraints and $l_{x}$ and $u_{x}$ are $n_{var}$ -dimensional vectors.

Here,

$x$ is an $n_{var}$ -dimensional vector representing the model parameters,
$χ$ is the loss function,
$ψ$ is the regularization function,
$ρ$ is the regularization coefficient,
$n_{res}$ is the number of residuals and $r_{i}$ is the $i$ th residual, which is defined as

$r_{i} (x) = y_{i} - ϕ (t_{i}; x), i = 1, \dots, n_{res}$

where $ϕ (t_{i}; x)$ is the predicted value of the $i$ th data point, given $x$ . For the $i$ th data point, $y_{i}$ and $t_{i}$ are the observed values of the dependent and independent variables, respectively.

The available loss and regularization function types are summarized in Table 1, where $d$ is the function parameter and $I (L)$ denotes an indicator function taking the value $1$ if the logical expression $L$ is true and $0$ otherwise. Loss function and regularization types can be specified by optional parameters NLDF Loss Function Type and Reg Term Type, respectively. For example, set NLDF Loss Function Type = LINF and Reg Term Type = L2 to use the $l_{\infty}$ -norm loss function with $l_{2}$ -norm (Ridge) regularization. See Section 11 for more details on the loss functions.

**Table 1**
Choices for the loss and regularization function types.
Loss function	$χ (r_{i})$	NLDF Loss Function Type
$l_{2}$ -norm	$r_{i}^{2}$	L2
$l_{1}$ -norm	$\lvert r_{i} \rvert$	L1
$l_{\infty}$ -norm	$\max_{1 \leq j \leq n_{res}} \lvert r_{j} \rvert / n_{res}$	LINF
Huber (see (7))	$\begin{cases} 0.5 r_{i}^{2} & \text{if } \lvert r_{i} \rvert < d \\ d (\lvert r_{i} \rvert - 0.5 d) & \text{otherwise} \end{cases}$	HUBER
Cauchy (see (4))	$\ln (1 + {(r_{i} / d)}^{2})$	CAUCHY
Atan	$\arctan (r_{i}^{2})$	ATAN
SmoothL1 (see (8))	$\begin{cases} 0.5 r_{i}^{2} / d & \text{if } \lvert r_{i} \rvert < d \\ \lvert r_{i} \rvert - 0.5 d & \text{otherwise} \end{cases}$	SMOOTHL1
Quantile (see (9))	$r_{i} (d - I (r_{i} < 0))$	QUANTILE
Regularization	$ψ (x_{i})$	Reg Term Type
Lasso ( $l_{1}$ -norm)	$\lvert x_{i} \rvert$	L1
Ridge ( $l_{2}$ -norm)	$x_{i}^{2}$	L2
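
The formulas in Table 1 and the objective in (1) can be sketched in a few lines of plain Python. This is only an illustration of the mathematics, not the code used inside e04gnf; the function names `loss_terms`, `reg_terms` and `objective` are hypothetical, and a single argument `d` stands in for whichever function parameter applies to the chosen loss.

```python
import math


def loss_terms(r, kind="L2", d=1.0):
    """Return the list of per-residual terms chi(r_i) from Table 1."""
    if kind == "L2":
        return [ri * ri for ri in r]
    if kind == "L1":
        return [abs(ri) for ri in r]
    if kind == "LINF":
        # Every term is max_j |r_j| / n_res, so the sum over i is max_j |r_j|.
        biggest = max(abs(ri) for ri in r)
        return [biggest / len(r) for _ in r]
    if kind == "HUBER":
        return [0.5 * ri * ri if abs(ri) < d else d * (abs(ri) - 0.5 * d)
                for ri in r]
    if kind == "CAUCHY":
        return [math.log(1.0 + (ri / d) ** 2) for ri in r]
    if kind == "ATAN":
        return [math.atan(ri * ri) for ri in r]
    if kind == "SMOOTHL1":
        return [0.5 * ri * ri / d if abs(ri) < d else abs(ri) - 0.5 * d
                for ri in r]
    if kind == "QUANTILE":
        # I(r_i < 0) is 1 for negative residuals and 0 otherwise.
        return [ri * (d - (1.0 if ri < 0 else 0.0)) for ri in r]
    raise ValueError("unknown loss type: " + kind)


def reg_terms(x, kind):
    """Return the list of per-variable terms psi(x_i) from Table 1."""
    if kind == "L1":
        return [abs(xi) for xi in x]
    if kind == "L2":
        return [xi * xi for xi in x]
    raise ValueError("unknown regularization type: " + kind)


def objective(r, x, loss="L2", d=1.0, reg=None, rho=0.0):
    """Evaluate f(x) in (1) from residuals r and variables x."""
    value = sum(loss_terms(r, loss, d))
    if reg is not None:
        value += rho * sum(reg_terms(x, reg))
    return value


r_demo = [3.0, -0.5, 0.0]
print(objective(r_demo, [1.0, -2.0], loss="HUBER", d=1.0, reg="L1", rho=2.0))
```

Note that for LINF each term is the same value, so the sum over all residuals reduces to the largest absolute residual.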

e04gnf serves as a solver for problems stored as a handle. The handle points to an internal data structure which defines the problem and serves as a means of communication for routines in the NAG optimization modelling suite. After the handle has been initialized (e.g., e04raf has been called), e04rmf can be used to add a model and define its residual sparsity structure. e04rsf and e04rtf may be used to set or modify quadratic constraints. Linear constraints ( $l_{B}$ , $B$ , $u_{B}$ ) are handled by e04rjf. Variable box bounds $l_{x}$ and $u_{x}$ can be specified with e04rhf, and e04rkf can set or modify nonlinear constraints. Once the problem is fully described, the handle may be passed to the solver e04gnf. When the handle is no longer needed, e04rzf should be called to destroy it and deallocate the memory held within. See Section 3.1 in the E04 Chapter Introduction for more details about the NAG optimization modelling suite.

Nonlinear Programming (NLP) solvers e04kff and e04stf are used as solver engines by e04gnf, which defines the selected loss function and regularization, then transforms the problem into a standard form that the NLP solvers accept. For best performance, e04kff is used when the objective function $f (x)$ is differentiable and there are no constraints other than simple bound constraints. For non-differentiable objective functions, or when constraints other than simple variable bounds are present, e04stf is used. See Section 11 in e04kff and e04stf for the algorithmic details.
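
The engine choice described above can be written as a small decision function. This is a sketch only. The document does not list which loss types count as differentiable, so the sets below are an assumption read off from the formulas in Table 1 (the L1, LINF and Quantile losses and the L1 regularization have kinks; the others are smooth). The function name `choose_engine` is hypothetical.

```python
# Assumption: these Table 1 choices are the non-differentiable ones.
NONSMOOTH_LOSSES = {"L1", "LINF", "QUANTILE"}
NONSMOOTH_REGS = {"L1"}


def choose_engine(loss, reg=None, has_general_constraints=False):
    """Return the name of the NLP engine expected for a given model.

    has_general_constraints is True if the model has linear, quadratic
    or nonlinear constraints, i.e. anything beyond simple variable bounds.
    """
    smooth_loss = loss.upper() not in NONSMOOTH_LOSSES
    smooth_reg = reg is None or reg.upper() not in NONSMOOTH_REGS
    if smooth_loss and smooth_reg and not has_general_constraints:
        return "e04kff"
    return "e04stf"


print(choose_engine("HUBER", reg="L2"))
print(choose_engine("L2", reg="L1"))
```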

The algorithm behaviour can be modified by various optional parameters (see Section 12), which can be set by e04zmf and e04zpf at any time between the initialization of the handle (e.g., by e04raf) and a call to the solver. Once the solver has finished, options may be modified for the next solve. The solver may be called repeatedly with various starting points and/or optional parameters. Option getter e04znf can be called to retrieve the current value of any option.

4 References

None.

5 Arguments

1: $handle$ – Type (c_ptr) Input

On entry: the handle to the problem. It needs to be initialized (e.g., by e04raf) and to hold a problem formulation compatible with e04gnf. It must not be changed between calls to the NAG optimization modelling suite.

2: $lsqfun$ – Subroutine, supplied by the user. External Procedure

lsqfun must evaluate the value of the nonlinear residuals, $r_{i} (x) ≔ y_{i} - ϕ (t_{i}; x), i = 1, \dots, n_{res}$ , at a specified point $x$ .

The specification of lsqfun is:

Fortran Interface

Subroutine lsqfun (

nvar, x, nres, rx, inform, iuser, ruser, cpuser)

Integer, Intent (In)	::	nvar, nres
Integer, Intent (Inout)	::	inform, iuser(*)
Real (Kind=nag_wp), Intent (In)	::	x(nvar)
Real (Kind=nag_wp), Intent (Inout)	::	ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	rx(nres)
Type (c_ptr), Intent (In)	::	cpuser

C Header Interface

void	lsqfun (const Integer *nvar, const double x[], const Integer *nres, double rx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser)

1: $nvar$ – Integer Input: On entry: $n_{var}$ , the current number of decision variables, $x$ , in the model.
2: $x (nvar)$ – Real (Kind=nag_wp) array Input: On entry: $x$ , the vector of variable values at which the residuals, $r_{i}$ , are to be evaluated.
3: $nres$ – Integer Input: On entry: $n_{res}$ , the current number of residuals in the model.
4: $rx (nres)$ – Real (Kind=nag_wp) array Output: On exit: the value of the residual vector, $r (x)$ , evaluated at $x$ .
5: $inform$ – Integer Input/Output: On entry: a non-negative value.

On exit: may be used to indicate that some residuals could not be computed at the requested point. This can be done by setting inform to a negative value. The solver will attempt a rescue procedure and request an alternative point. If the rescue procedure fails, the solver will exit with $ifail = 25$ .
6: $iuser (*)$ – Integer array User Workspace
7: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace
8: $cpuser$ – Type (c_ptr) User Workspace: lsqfun is called with the arguments iuser, ruser and cpuser as supplied to e04gnf. You should use the arrays iuser and ruser, and the data handle cpuser to supply information to lsqfun.

lsqfun must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which e04gnf is called. Arguments denoted as Input must not be changed by this procedure.
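
The inform mechanism can be illustrated with a plain-Python analogue of a residual callback. This is not the NAG Python interface: the signature, the `(rx, inform)` return convention and the model `log_model` are all hypothetical, chosen only to show when a callback would report a failure with a negative inform.

```python
import math


def log_model(t, x):
    # Hypothetical model; math.log raises ValueError when x[1] * t <= 0.
    return x[0] * math.log(x[1] * t)


def lsqfun(x, t, y, model):
    """Return (rx, inform) with rx[i] = y[i] - model(t[i], x).

    inform is 0 on success and -1 if any residual cannot be computed,
    mirroring the negative-inform convention described above.
    """
    rx = []
    for ti, yi in zip(t, y):
        try:
            ri = yi - model(ti, x)
        except (ValueError, OverflowError, ZeroDivisionError):
            return [], -1
        if not math.isfinite(ri):
            return [], -1
        rx.append(ri)
    return rx, 0


print(lsqfun([2.0, 1.0], [1.0], [0.5], log_model))
print(lsqfun([1.0, -1.0], [1.0, 2.0], [0.0, 0.0], log_model))
```

Checking for non-finite values inside the callback keeps NaN and infinity from reaching the solver.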

3: $lsqgrd$ – Subroutine, supplied by the user. External Procedure

lsqgrd evaluates the residual gradients, $\nabla r_{i} (x)$ , at a specified point $x$ .

The specification of lsqgrd is:

Fortran Interface

Subroutine lsqgrd (

nvar, x, nres, nnzrd, rdx, inform, iuser, ruser, cpuser)

Integer, Intent (In)	::	nvar, nres, nnzrd
Integer, Intent (Inout)	::	inform, iuser(*)
Real (Kind=nag_wp), Intent (In)	::	x(nvar)
Real (Kind=nag_wp), Intent (Inout)	::	rdx(nnzrd), ruser(*)
Type (c_ptr), Intent (In)	::	cpuser

C Header Interface

void	lsqgrd (const Integer *nvar, const double x[], const Integer *nres, const Integer *nnzrd, double rdx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser)

1: $nvar$ – Integer Input

On entry: $n_{var}$ , the current number of decision variables, $x$ , in the model.

2: $x (nvar)$ – Real (Kind=nag_wp) array Input

On entry: $x$ , the vector of variable values at which the residual gradients, $\nabla r_{i} (x)$ , are to be evaluated.

3: $nres$ – Integer Input

On entry: $n_{res}$ , the current number of residuals in the model.

4: $nnzrd$ – Integer Input

On entry: the number of nonzeros in the first derivative matrix. If $isparse = 0$ in the call to e04rmf (recommended use for e04gnf) then $nnzrd = nvar \times nres$ .

5: $rdx (nnzrd)$ – Real (Kind=nag_wp) array Input/Output

On entry: the elements should only be assigned and not referenced.

On exit: the vector containing the nonzero residual gradients evaluated at $x$ ,

$\nabla r (x) = [\nabla r_{1} (x), \nabla r_{2} (x), \dots, \nabla r_{n_{res}} (x)],$

where

$\nabla r_{i} (x) = {[\frac{\partial r_{i} (x)}{\partial x_{1}}, \dots, \frac{\partial r_{i} (x)}{\partial x_{n_{var}}}]}^{T}.$

The elements must be stored in the same order as the defined sparsity pattern provided in the call to e04rmf.

6: $inform$ – Integer Input/Output

On entry: a non-negative value.

On exit: may be used to indicate that the residual gradients could not be computed at the requested point. This can be done by setting inform to a negative value. The solver will attempt a rescue procedure and request an alternative point. If the rescue procedure fails, the solver will exit with $ifail = 25$ .

7: $iuser (*)$ – Integer array User Workspace

8: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace

9: $cpuser$ – Type (c_ptr) User Workspace

lsqgrd is called with the arguments iuser, ruser and cpuser as supplied to e04gnf. You should use the arrays iuser and ruser, and the data handle cpuser to supply information to lsqgrd.

lsqgrd must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which e04gnf is called. Arguments denoted as Input must not be changed by this procedure.
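
As an illustration of how rdx might be filled in the dense case, the sketch below packs the gradients so that the $n_{var}$ partial derivatives of each residual are contiguous. This layout is an assumption (the columns $\nabla r_{i} (x)$ stored one after another); the order that actually applies is the one defined by the sparsity pattern given to e04rmf. The helper names and the straight-line model are hypothetical.

```python
def pack_dense_gradients(x, nres, grad_ri):
    """Pack residual gradients into a flat list of length nvar * nres.

    grad_ri(i, x) must return the nvar partial derivatives of residual i.
    Assumed layout (0-based): rdx[i * nvar + j] = d r_i / d x_j.
    """
    nvar = len(x)
    rdx = [0.0] * (nvar * nres)
    for i in range(nres):
        g = grad_ri(i, x)
        for j in range(nvar):
            rdx[i * nvar + j] = g[j]
    return rdx


# Hypothetical model: r_i(x) = y_i - (x_1 + x_2 * t_i),
# so d r_i / d x_1 = -1 and d r_i / d x_2 = -t_i.
T_DEMO = [1.0, 2.0, 3.0]


def grad_line(i, x):
    return [-1.0, -T_DEMO[i]]


rdx_demo = pack_dense_gradients([0.5, 0.25], len(T_DEMO), grad_line)
print(rdx_demo)
```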

4: $confun$ – Subroutine, supplied by the NAG Library or the user. External Procedure

confun must calculate the values of the $m_{g}$ -element vector $g (x)$ of nonlinear constraint functions at a specified value of the $n_{var}$ -element vector of variables, $x$ . If there are no nonlinear constraints then confun will never be called by e04gnf and it may be the dummy subroutine e04gnx included in the NAG Library.

The specification of confun is:

Fortran Interface

Subroutine confun (

nvar, x, ncnln, gx, inform, iuser, ruser, cpuser)

Integer, Intent (In)	::	nvar, ncnln
Integer, Intent (Inout)	::	inform, iuser(*)
Real (Kind=nag_wp), Intent (In)	::	x(nvar)
Real (Kind=nag_wp), Intent (Inout)	::	ruser(*)
Real (Kind=nag_wp), Intent (Out)	::	gx(ncnln)
Type (c_ptr), Intent (In)	::	cpuser

C Header Interface

void	confun (const Integer *nvar, const double x[], const Integer *ncnln, double gx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser)

1: $nvar$ – Integer Input: On entry: $n_{var}$ , the current number of decision variables, $x$ , in the model.
2: $x (nvar)$ – Real (Kind=nag_wp) array Input: On entry: the vector $x$ of variable values at which the constraint functions are to be evaluated.
3: $ncnln$ – Integer Input: On entry: $m_{g}$ , the number of nonlinear constraints, as specified in an earlier call to e04rkf.
4: $gx (ncnln)$ – Real (Kind=nag_wp) array Output: On exit: the $m_{g}$ values of the nonlinear constraint functions at $x$ .
5: $inform$ – Integer Input/Output: On entry: a non-negative value.

On exit: must be set to a value describing the action to be taken by the solver on return from confun. Specifically, if the value is negative, then the value of gx will be discarded and the solver will either attempt to find a different trial point or terminate immediately with $ifail = 25$ ; otherwise, the solver will proceed normally.
6: $iuser (*)$ – Integer array User Workspace
7: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace
8: $cpuser$ – Type (c_ptr) User Workspace: confun is called with the arguments iuser, ruser and cpuser as supplied to e04gnf. You should use the arrays iuser and ruser, and the data handle cpuser to supply information to confun.

confun must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which e04gnf is called. Arguments denoted as Input must not be changed by this procedure.

Note: confun should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by e04gnf. If your code inadvertently does return any NaNs or infinities, e04gnf is likely to produce unexpected results.

5: $congrd$ – Subroutine, supplied by the NAG Library or the user. External Procedure

congrd must calculate the nonzero values of the sparse Jacobian of the nonlinear constraint functions, $\frac{\partial g_{i}}{\partial x}$ , at a specified value of the $n_{var}$ -element vector of variables, $x$ . If there are no nonlinear constraints, congrd will never be called by e04gnf and may be the dummy subroutine e04gny included in the NAG Library.

The specification of congrd is:

Fortran Interface

Subroutine congrd (

nvar, x, nnzgd, gdx, inform, iuser, ruser, cpuser)

Integer, Intent (In)	::	nvar, nnzgd
Integer, Intent (Inout)	::	inform, iuser(*)
Real (Kind=nag_wp), Intent (In)	::	x(nvar)
Real (Kind=nag_wp), Intent (Inout)	::	gdx(nnzgd), ruser(*)
Type (c_ptr), Intent (In)	::	cpuser

C Header Interface

void	congrd (const Integer *nvar, const double x[], const Integer *nnzgd, double gdx[], Integer *inform, Integer iuser[], double ruser[], void **cpuser)

1: $nvar$ – Integer Input: On entry: $n_{var}$ , the current number of decision variables, $x$ , in the model.
2: $x (nvar)$ – Real (Kind=nag_wp) array Input: On entry: the vector $x$ of variable values at which the Jacobian of the constraint functions is to be evaluated.
3: $nnzgd$ – Integer Input: On entry: the number of nonzero elements in the sparse Jacobian of the constraint functions, as set in a previous call to e04rkf.
4: $gdx (nnzgd)$ – Real (Kind=nag_wp) array Input/Output: On entry: the elements should only be assigned and not referenced.

On exit: the nonzero values of the Jacobian of the nonlinear constraints, in the order specified by irowgd and icolgd in an earlier call to e04rkf. $gdx (i)$ will be the gradient $\frac{\partial g_{j}}{\partial x_{k}}$ , where $j = irowgd (i)$ and $k = icolgd (i)$ .
5: $inform$ – Integer Input/Output: On entry: a non-negative value.

On exit: must be set to a value describing the action to be taken by the solver on return from congrd. Specifically, if the value is negative the solution of the current problem will terminate immediately with $ifail = 25$ ; otherwise, computations will continue.
6: $iuser (*)$ – Integer array User Workspace
7: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace
8: $cpuser$ – Type (c_ptr) User Workspace: congrd is called with the arguments iuser, ruser and cpuser as supplied to e04gnf. You should use the arrays iuser and ruser, and the data handle cpuser to supply information to congrd.

congrd must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which e04gnf is called. Arguments denoted as Input must not be changed by this procedure.

Note: congrd should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by e04gnf. If your code inadvertently does return any NaNs or infinities, e04gnf is likely to produce unexpected results.
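
The coordinate ordering of gdx can be illustrated with a plain-Python sketch. The two constraints used here, $g_{1} (x) = x_{1}^{2} + x_{3}$ and $g_{2} (x) = x_{2} x_{3}$ , are hypothetical, as is the function name; only the rule that $gdx (i)$ holds $\frac{\partial g_{j}}{\partial x_{k}}$ with $j = irowgd (i)$ and $k = icolgd (i)$ comes from the description of gdx above.

```python
def constraint_jacobian_values(x, irowgd, icolgd):
    """Return the nonzero Jacobian values in the order of (irowgd, icolgd).

    Indices are 1-based, as in the Fortran interface.
    """
    x1, x2, x3 = x
    partials = {
        (1, 1): 2.0 * x1,  # d g_1 / d x_1
        (1, 3): 1.0,       # d g_1 / d x_3
        (2, 2): x3,        # d g_2 / d x_2
        (2, 3): x2,        # d g_2 / d x_3
    }
    return [partials[(j, k)] for j, k in zip(irowgd, icolgd)]


gdx_demo = constraint_jacobian_values([1.0, 2.0, 3.0], [1, 1, 2, 2], [1, 3, 2, 3])
print(gdx_demo)
```

Looking each entry up by its (row, column) pair means the values follow whatever ordering was registered, rather than relying on a fixed loop order.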

6: $monit$ – Subroutine, supplied by the NAG Library or the user. External Procedure

monit is provided to enable monitoring of the progress of the optimization and, if necessary, to halt the optimization process.

If no monitoring is required, monit may be the dummy subroutine e04gnu supplied in the NAG Library.

monit is called at the end of every $i$ th step, where $i$ is controlled by the optional parameter NLDF Monitor Frequency (if the value is $0$ , monit is not called).

The specification of monit is:

Fortran Interface

Subroutine monit (

nvar, x, inform, rinfo, stats, iuser, ruser, cpuser)

Integer, Intent (In)	::	nvar
Integer, Intent (Inout)	::	inform, iuser(*)
Real (Kind=nag_wp), Intent (In)	::	x(nvar), rinfo(100), stats(100)
Real (Kind=nag_wp), Intent (Inout)	::	ruser(*)
Type (c_ptr), Intent (In)	::	cpuser

C Header Interface

void	monit (const Integer *nvar, const double x[], Integer *inform, const double rinfo[], const double stats[], Integer iuser[], double ruser[], void **cpuser)

1: $nvar$ – Integer Input: On entry: $n_{var}$ , the current number of decision variables, $x$ , in the model.
2: $x (nvar)$ – Real (Kind=nag_wp) array Input: On entry: the current best point.
3: $inform$ – Integer Input/Output: On entry: a non-negative value.

On exit: may be used to request the solver to stop immediately by setting inform to a non-zero value in which case it will terminate with $ifail = 20$ ; otherwise, the solver will proceed normally.
4: $rinfo (100)$ – Real (Kind=nag_wp) array Input: On entry: best objective value computed and various indicators (the values are as described in the main argument rinfo).
5: $stats (100)$ – Real (Kind=nag_wp) array Input: On entry: solver statistics at monitoring steps or at the end of the current iteration (the values are as described in the main argument stats).
6: $iuser (*)$ – Integer array User Workspace
7: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace
8: $cpuser$ – Type (c_ptr) User Workspace: monit is called with the arguments iuser, ruser and cpuser as supplied to e04gnf. You should use the arrays iuser and ruser, and the data handle cpuser to supply information to monit.

monit must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which e04gnf is called. Arguments denoted as Input must not be changed by this procedure.

7: $nvar$ – Integer Input

On entry: $n_{var}$ , the current number of decision variables, $x$ , in the model.

8: $x (nvar)$ – Real (Kind=nag_wp) array Input/Output

On entry: $x_{0}$ , the initial estimates of the variables, $x$ .

On exit: the final values of the variables, $x$ .

9: $nres$ – Integer Input

On entry: $n_{res}$ , the current number of residuals in the model.

10: $rx (nres)$ – Real (Kind=nag_wp) array Output

On exit: the values of the residuals at the final point given in x.

11: $rinfo (100)$ – Real (Kind=nag_wp) array Output

On exit: objective value and various indicators at monitoring steps or at the end of the final iteration. The measures are:

$1$	Objective function value, $f (x)$ in (1).
$2$	Loss function value, $\sum_{i = 1}^{n_{res}} χ (r_{i} (x))$ in (1).
$3$	Regularization term value, $\sum_{i = 1}^{n_{var}} ψ (x_{i})$ in (1).
$4$	Solution optimality measure.
$5 - 100$	Reserved for future use.

12: $stats (100)$ – Real (Kind=nag_wp) array Output

On exit: solver statistics at monitoring steps or at the end of the final iteration:

$1$	Number of iterations performed.
$2$	Total time in seconds spent in the solver. It includes time spent in user-supplied subroutines.
$3 - 100$	Reserved for future use.

13: $iuser (*)$ – Integer array User Workspace

14: $ruser (*)$ – Real (Kind=nag_wp) array User Workspace

15: $cpuser$ – Type (c_ptr) User Workspace

iuser, ruser and cpuser are not used by e04gnf, but are passed directly to lsqfun, lsqgrd, confun, congrd and monit and may be used to pass information to these routines. If you do not need to reference cpuser, it should be initialized to c_null_ptr.

16: $ifail$ – Integer Input/Output

On entry: ifail must be set to $0$ , $−1$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.

A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $−1$ means that an error message is printed while a value of $1$ means that it is not.

If halting is not appropriate, the value $−1$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $−1$ is recommended since useful values can be provided in some output arguments even when $ifail \neq 0$ on exit. When the value $−1$ or $1$ is used it is essential to test the value of ifail on exit.

On exit: $ifail = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry $ifail = 0$ or $−1$ , explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

Note: in some cases e04gnf may return useful information.

$ifail = 1$: The supplied handle does not define a valid handle to the data structure for the NAG optimization modelling suite. It has not been properly initialized or it has been corrupted.

$ifail = 2$: The problem is already being solved.

This solver does not support the model defined in the handle.

$ifail = 4$: On entry, $nres = ⟨ value ⟩$ , expected $value = ⟨ value ⟩$ .
Constraint: nres must match the current number of residuals defined in the handle.

On entry, $nvar = ⟨ value ⟩$ , expected $value = ⟨ value ⟩$ .
Constraint: nvar must match the current number of variables of the model in the handle.

$ifail = 7$: The dummy confun routine was called but the problem requires these values. Please provide a proper confun routine.

The dummy congrd routine was called but the problem requires these derivatives. Please provide a proper congrd routine.

$ifail = 20$: User requested termination during a monitoring step. inform was set to a negative value in monit.

$ifail = 21$: The current starting point is unusable.
Either inform was set to a negative value within the user-supplied functions lsqfun, lsqgrd, or an Infinity or NaN was detected in values returned from them.

$ifail = 22$: Maximum number of iterations reached.

$ifail = 23$: The solver terminated after the maximum time allowed was exhausted.
Maximum number of seconds exceeded. Use optional parameter Time Limit to change the limit.

$ifail = 24$: The solver was terminated because no further progress could be achieved.
This can indicate that the solver is calculating very small step sizes and is making very little progress. It could also indicate that the problem has been solved to the best numerical accuracy possible given the current scaling.

$ifail = 25$: Invalid number detected in user function. Either inform was set to a negative value within the user-supplied functions lsqfun, lsqgrd, confun, congrd, or an Infinity or NaN was detected in values returned from them.

$ifail = 28$: The solver terminated after an error in the step computation. This message is printed if the solver is unable to compute a search direction, despite several attempts to modify the iteration matrix.

The solver terminated after failure during line search. This could happen if the transformed problem is highly degenerate, does not satisfy the constraint qualification, or if the user-supplied code provides incorrect derivative information.

The solver terminated with not enough degrees of freedom. This indicates that the problem, as specified, has too few degrees of freedom. This can happen if there are too many equality constraints, or if there are too many fixed variables.

$ifail = 50$: Problem was solved to an acceptable level; full accuracy was not achieved.
This indicates that the algorithm detected a sequence of very small reductions in the objective function value and is unable to reach a point satisfying the requested optimality tolerance. This may happen if the desired tolerances are too small for the current problem, or if the input data is badly scaled.

$ifail = 51$: The solver detected an infeasible problem. This indicates that the problem may be infeasible or at least that the algorithm is stuck at a locally infeasible point. If you believe that the problem is feasible, it might help to start the solver from a different point.

$ifail = - 199$: e04gnf is not available in this implementation.

$ifail = - 99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 399$: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.

$ifail = - 999$: Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
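
When ifail is set to $−1$ or $1$ on entry, the calling program has to interpret the exit value itself. A lookup table condensed from the list above is one way to do that. The wording of each entry is a paraphrase of this section, and the names `IFAIL_MEANING` and `describe_ifail` are hypothetical.

```python
# Short paraphrases of the exit values listed in this section.
IFAIL_MEANING = {
    0: "no error or warning",
    1: "handle not initialized or corrupted",
    2: "problem already being solved, or model not supported by this solver",
    4: "nvar or nres does not match the model in the handle",
    7: "dummy confun or congrd called although the problem needs it",
    20: "termination requested in monit",
    21: "starting point unusable",
    22: "maximum number of iterations reached",
    23: "time limit exceeded",
    24: "no further progress could be achieved",
    25: "invalid number or negative inform in a user-supplied function",
    28: "failure in step computation, line search or degrees of freedom",
    50: "solved to an acceptable level, full accuracy not achieved",
    51: "infeasible problem detected",
    -99: "unexpected error, contact NAG",
    -199: "routine not available in this implementation",
    -399: "licence key problem",
    -999: "dynamic memory allocation failed",
}


def describe_ifail(ifail):
    """Return a short description for an exit value of ifail."""
    return IFAIL_MEANING.get(ifail, "value not listed in this section")


print(describe_ifail(50))
```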

7 Accuracy

The accuracy of the solution is determined by the optional parameter NLDF Stop Tolerance.

In the case where both the loss function and the regularization are differentiable, and with only simple bound constraints, if $ifail = 0$ on exit, the returned point satisfies the first-order optimality conditions defined by (6) and (7) in e04kff to the requested accuracy. If the loss or regularization functions are non-differentiable, or the model defines linear, quadratic or general nonlinear constraints, the model is transformed and solved by e04stf. In this case, if $ifail = 0$ on exit, the returned point satisfies the Karush–Kuhn–Tucker (KKT) conditions defined by (10) in e04stf to the requested accuracy.

8 Parallelism and Performance

Background information to multithreading can be found in the Multithreading documentation.

e04gnf is not threaded in any implementation.

9 Further Comments

9.1 Description of the Printed Output

The solver can print information to give an overview of the problem and the progress of the computation. The output may be sent to two independent unit numbers which are set by optional parameters Print File and Monitoring File. Optional parameters Print Level, Print Options, Monitoring Level and Print Solution determine the exposed level of detail. This allows, for example, a detailed log file to be generated while the condensed information is displayed on the screen.

By default ( $Print File = 6$ , $Print Level = 2$ ), the following sections are printed to the standard output:

Header
Optional parameters list (if $Print Options = YES$ )
Problem statistics
Iteration log
Summary
Solution (if $Print Solution = YES$ )

Header

The header is a message indicating the start of the solver. It should look like:

-------------------------------
 E04GN, Nonlinear Data-Fitting
-------------------------------

Optional parameters list

If $Print Options = YES$ , a list of the optional parameters and their values is printed before the problem statistics. The list shows all options of the solver, each displayed on one line. Each line contains the option name, its current value and an indicator for how it was set. The options unchanged from their defaults are noted by ‘d’ and the ones you have set are noted by ‘U’. Note that the output format is compatible with the file format expected by e04zpf. The output looks similar to:

Begin of Options
    Nldf Loss Function Type       =            Smoothl1     * U
    Nldf Huber Function Width     =         1.00000E+00     * d
    Nldf Cauchy Function Sharpness=         1.00000E+00     * d
    Nldf Smoothl1 Function Width  =         1.00000E+00     * d
    Nldf Quantile Parameter       =         5.00000E-01     * d
    Nldf Iteration Limit          =            10000000     * d
    Nldf Stop Tolerance           =         1.00000E-06     * d
    Nldf Monitor Frequency        =                   0     * d
End of Options
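
A listing in this layout is simple to post-process. The sketch below reads such a listing into a dictionary and records whether each option was set by the user. It relies only on the line format shown above (name, `=`, value, `*`, indicator); the function name `parse_options` is hypothetical and the sample text is a shortened copy of the listing.

```python
SAMPLE_OPTIONS = """\
Begin of Options
    Nldf Loss Function Type       =            Smoothl1     * U
    Nldf Cauchy Function Sharpness=         1.00000E+00     * d
    Nldf Stop Tolerance           =         1.00000E-06     * d
End of Options
"""


def parse_options(text):
    """Map each option name to (value_string, set_by_user)."""
    options = {}
    for line in text.splitlines():
        if "=" not in line or "*" not in line:
            continue  # skips the "Begin of Options" / "End of Options" lines
        name, _, rest = line.partition("=")
        value, _, flag = rest.rpartition("*")
        options[name.strip()] = (value.strip(), flag.strip() == "U")
    return options


for option, (value, user_set) in parse_options(SAMPLE_OPTIONS).items():
    print(option, value, "user" if user_set else "default")
```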

Problem statistics

If $Print Level \geq 2$ , statistics on the problem are printed, for example:

Problem Statistics
  No of variables                  6
    linear                         2
    nonlinear                      4
    free (unconstrained)           4
    bounded                        2
  No of lin. constraints           1
    nonzeroes                      2
  No of quad.constraints           0
  No of nln. constraints           4
    nonzeroes                      3
  Loss function             SmoothL1
    No of residuals               24
  Regularization             L1 Norm

Iteration log

If $Print Level \geq 2$ , the solver will print a summary line for each step. If no regularization function is specified, the output shows the iteration number, the current loss function value and the optimality measure. The output looks as follows:

--------------------------
 it|   lossfun  |  optim
--------------------------
  1  1.05863E+02  3.25E+02
  2  3.51671E+01  9.03E+01
  3  2.79091E+01  4.95E+01
  4  2.63484E+01  1.43E+01
  5  2.30795E+01  2.59E+01

If you specify a regularization type via Reg Term Type, two more columns will be printed out showing the regularization function value and the objective function value $f (x)$ defined in (1). It might look as follows:

----------------------------------------------------
 it|   objfun   |   lossfun  |     reg    |  optim
----------------------------------------------------
  1  2.11489E+01  2.02833E+01  8.65649E-01  1.27E+02
  2  2.28405E+01  2.21698E+01  6.70710E-01  9.47E+01
  3  2.30351E+01  2.23599E+01  6.75246E-01  1.62E+01
  4  2.22836E+01  2.15675E+01  7.16133E-01  1.32E+01
  5  2.22676E+01  2.15522E+01  7.15411E-01  3.80E+00
  6  2.21776E+01  2.14746E+01  7.02983E-01  3.92E-01
  7  2.22075E+01  2.15060E+01  7.01510E-01  6.08E-01
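
As a quick consistency check on the sample log above, the printed columns agree with objfun = lossfun + reg to the six significant digits shown. The Python sketch below verifies this for the seven rows. It assumes the reg column already includes the regularization coefficient (or that the coefficient is 1); this is an inference from the printed numbers, not something stated in this section.

```python
# Rows copied from the sample iteration log above: (objfun, lossfun, reg).
rows = [
    (2.11489e01, 2.02833e01, 8.65649e-01),
    (2.28405e01, 2.21698e01, 6.70710e-01),
    (2.30351e01, 2.23599e01, 6.75246e-01),
    (2.22836e01, 2.15675e01, 7.16133e-01),
    (2.22676e01, 2.15522e01, 7.15411e-01),
    (2.21776e01, 2.14746e01, 7.02983e-01),
    (2.22075e01, 2.15060e01, 7.01510e-01),
]


def max_relative_gap(table):
    """Largest relative difference between objfun and lossfun + reg."""
    return max(abs(obj - (loss + reg)) / abs(obj) for obj, loss, reg in table)


gap = max_relative_gap(rows)
print(f"largest relative gap: {gap:.1e}")
```

A gap at the level of the print rounding suggests that the three columns are reported on the same scale.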

Summary

Once the solver finishes, a summary is produced:

 ----------------------------------------------------
 Status: converged, an optimal solution found
 ----------------------------------------------------
 Final objective value                2.216883E+01
 Final loss function value            2.146563E+01
 Final regularization value           7.031993E-01
 Solver stopping precision            1.068791E-08
 Iterations                                     16

Optionally, if Stats Time = YES, the timings are printed:

 Total time                               0.19 sec

Solution

If Print Solution = YES, the values of the primal variables are printed. It might look as follows:

 Primal variables:
   idx   Lower bound       Value       Upper bound
     1  -1.00000E+00    9.69201E-02         inf
     2   0.00000E+00    7.95110E-01    1.00000E+00

10 Example

This example demonstrates how to define and solve a nonlinear regression problem using both least squares and robust regression. The regression problem consists of nres = 24 observations, $(t, y)$, to be fitted over the nvar = 2 parameter model

$y = ϕ (t; x) + ε = t x_{1} \sin (- t x_{2}) + ε,$

and the residuals for the regression are

$r_{i} (x) = y_{i} - ϕ (t_{i}; x), \quad i = 1, \dots, nres,$

where

$\begin{array}{lcl} t & = & (1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, 8.00, 9.00, 10.00, \\ & & 11.00, 12.00, 13.00, 14.00, 15.00, 16.00, 17.00, 18.00, 19.00, 20.00, \\ & & 21.00, 22.00, 23.00, 24.00); \\ y & = & (0.0523, 0.1442, 0.0422, 1.8106, 0.3271, 0.4684, 0.4593, -0.0169, -0.7811, -1.1356, \\ & & -0.5343, -3.0043, 1.1832, 1.5153, 0.7120, -2.2923, -1.4871, -1.7083, -0.9936, -5.2873, \\ & & 1.7555, 2.0642, 0.9499, -0.6234) . \end{array}$

The data above is generated by setting $x_{1} = 0.1$ and $x_{2} = 0.8$ with added random noise. The data points $(t_{i}, y_{i})$ for $i = 4, 12, 16, 20$ are set to be outliers.

The following bounds and general linear constraints are defined on the variables:

$\begin{array}{rcccl} - 1.0 & \leq & x_{1}, & & \\ 0.0 & \leq & x_{2} & \leq & 1.0, \\ 0.2 & \leq & x_{1} + x_{2} & \leq & 1.0 . \end{array}$

Setting the initial guess to be $x_{0} = (0.3, 0.7)$, the problem is first solved using the least squares loss function and the expected solution is $x^{*} = (0.0944, 0.7740)$. Then the loss function type is switched to SmoothL1 and the problem is solved again. The expected solution is $x^{*} = (0.0969, 0.7951)$, which is closer to the generating values $(0.1, 0.8)$ than the least squares fit and so handles the outliers better.
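
The model and residuals of this example can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the shipped example program (e04gnfe.f90) and it does not call the NAG Library; the helper names `model` and `residuals` are chosen here for illustration. Evaluating the residuals at the generating values $(0.1, 0.8)$ makes the four outliers easy to spot.

```python
import math

# Data copied from the example: t = 1, ..., 24 and the observations y.
t = [float(i) for i in range(1, 25)]
y = [0.0523, 0.1442, 0.0422, 1.8106, 0.3271, 0.4684, 0.4593, -0.0169,
     -0.7811, -1.1356, -0.5343, -3.0043, 1.1832, 1.5153, 0.7120, -2.2923,
     -1.4871, -1.7083, -0.9936, -5.2873, 1.7555, 2.0642, 0.9499, -0.6234]


def model(ti, x):
    """phi(t; x) = t * x1 * sin(-t * x2)."""
    return ti * x[0] * math.sin(-ti * x[1])


def residuals(x):
    """r_i(x) = y_i - phi(t_i; x) for i = 1, ..., nres."""
    return [yi - model(ti, x) for ti, yi in zip(t, y)]


r = residuals([0.1, 0.8])  # parameter values used to generate the data

# 1-based indices of the four residuals that are largest in absolute value.
by_size = sorted(range(1, len(r) + 1), key=lambda i: -abs(r[i - 1]))
largest = sorted(by_size[:4])
print(largest)
```

The four indices printed should coincide with the observations that the text identifies as outliers.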

10.1 Program Text

Program Text (e04gnfe.f90)

10.2 Program Data

None.

10.3 Program Results

Program Results (e04gnfe.r)

11 Algorithmic Details

e04gnf will use the provided model, model derivative, and data to internally construct an optimization problem, and pass it off to the appropriate nonlinear programming solver. Depending on the choice of loss and regularization functions, as well as the types of constraints present, either a first-order active set method solver (e.g., e04kff) or a general nonlinear programming solver (e.g., e04stf) will be called to maximize performance.

11.1 Loss Function and Regularization Types

Loss function and regularization types can be specified by optional parameters NLDF Loss Function Type and Reg Term Type, respectively. The definitions of the available loss and regularization functions are:

• L$2$ loss function:

$χ (r_{i} (x)) = r_{i} {(x)}^{2},$ (2)

L$2$ loss involves the $l_{2}$-norm, which defines the least squares error. This is the default loss function used by e04gnf.
• L$1$ loss function:

$χ (r_{i} (x)) = | r_{i} (x) | ,$ (3)

L$1$ loss involves the $l_{1}$-norm, which defines the least absolute deviation. It is robust against outliers, while still allowing them to impact the search direction to some degree. Outliers are not handled by L$1$ loss as robustly as by, e.g., the Atan loss function.
• Cauchy loss function:

$χ (r_{i} (x)) = \ln (1 + {(r_{i} (x) / d_{cau})}^{2}),$ (4)

where $d_{cau}$ represents the optional parameter NLDF Cauchy Function Sharpness.

Cauchy loss is robust against outliers, more so than L$1$ loss but less than Atan loss. An advantage of using this function is that the residual derivatives (Jacobian) approach $0$ for large input, although more slowly than for Atan loss, by a factor proportional to $r_{i} {(x)}^{2}$. The parameter $d_{cau}$ controls how sharp the Cauchy function is, with large $d_{cau}$ resulting in a more flattened curve.
• Atan loss function:

$χ (r_{i} (x)) = \arctan (r_{i} {(x)}^{2}),$ (5)

Atan tends to $π / 2$ as input tends to infinity, and its derivatives tend to $0$. This means extreme outliers will have a negligible effect on the search direction compared to non-outliers. It also means it is important to ensure the data is well scaled and that the starting point is a reasonable guess at the true solution, if possible.
• Linf loss function:

$χ (r_{i} (x)) = \max_{j} | r_{j} (x) | / n_{res},$ (6)

Linf loss involves the $l_{\infty}$-norm, which leads to the term $\sum_{i = 1}^{n_{res}} χ (r_{i} (x))$ in the objective function evaluating to $\max_{j} | r_{j} (x) |$. This will typically be a much smaller number than for other loss functions, since only one residual is contributing. This also means that the regularization function (if present) will have a relatively larger effect; using a lower regularization coefficient can balance this out.
• Huber loss function:

$χ (r_{i} (x)) = \begin{cases} 0.5 * r_{i} {(x)}^{2} & \text{if } | r_{i} (x) | < d_{hub}, \\ d_{hub} * (| r_{i} (x) | - 0.5 * d_{hub}) & \text{otherwise}, \end{cases}$ (7)

where $d_{hub}$ represents the optional parameter NLDF Huber Function Width.

Huber loss is robust against outliers, since it increases only linearly for $| r_{i} (x) | \geq d_{hub}$. It is also differentiable at $0$. The loss function behaviour depends on the chosen value of $d_{hub}$; the value of $d_{hub}$ must be smaller than the difference between the outliers and the model for them to fall within the linear part of the Huber function. Increasing the parameter also increases the height and steepness of the linear part; this should be considered when comparing the final objective function value for different values of $d_{hub}$, as well as the relative effect of regularization.

• SmoothL$1$ loss function:

$χ (r_{i} (x)) = \begin{cases} 0.5 * r_{i} {(x)}^{2} / d_{sl 1} & \text{if } | r_{i} (x) | < d_{sl 1}, \\ | r_{i} (x) | - 0.5 * d_{sl 1} & \text{otherwise}, \end{cases}$ (8)

where $d_{sl 1}$ represents the optional parameter NLDF SmoothL1 Function Width.

This loss function is equal to Huber loss weighted by a factor of $1 / d_{sl 1}$. The linear part's gradient is unaffected by $d_{sl 1}$, so the relative effect of regularization will be similar over different values of $d_{sl 1}$. Functionally it is similar to L$1$ loss if $d_{sl 1}$ is small, and L$2$ loss if $d_{sl 1}$ is very large.

• Quantile loss function:

$χ (r_{i} (x)) = r_{i} (x) \cdot (d_{qnt} - I (r_{i} (x) < 0)),$ (9)

where

$I (r_{i} (x) < 0) = \begin{cases} 1 & \text{if } r_{i} (x) < 0, \\ 0 & \text{otherwise}, \end{cases}$

and $d_{qnt}$ represents the optional parameter NLDF Quantile Parameter. Quantile loss function is similar to L$1$ loss, but adjustment of the parameter $d_{qnt}$ can ‘tilt’ it. For example, $d_{qnt} = 0.95$ will lead to the model being fitted above more of the data than with L$1$ loss, since negative residuals contribute lower loss than positive ones. $d_{qnt} = 0.05$ will have the opposite effect; the model will be fitted below more of the data, since positive residuals give lower loss than negative ones.
• L$1$ regularization:

$ψ (x_{i}) = | x_{i} | .$
• L$2$ regularization:

$ψ (x_{i}) = x_{i}^{2} .$
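
The definitions (2) to (9) translate directly into code. The following Python sketch is a plain transcription of those formulas for experimenting with how each loss treats a small residual and an outlier; the function and argument names are chosen here for illustration and are not part of the NAG interface.

```python
import math


def l2(r):
    return r * r                                   # (2)


def l1(r):
    return abs(r)                                  # (3)


def cauchy(r, d_cau=1.0):
    return math.log(1.0 + (r / d_cau) ** 2)        # (4)


def atan(r):
    return math.atan(r * r)                        # (5)


def linf_total(residuals):
    """Sum over i of (6): n_res copies of max|r_j| / n_res, i.e. max|r_j|."""
    return max(abs(r) for r in residuals)


def huber(r, d_hub=1.0):
    if abs(r) < d_hub:                             # quadratic part of (7)
        return 0.5 * r * r
    return d_hub * (abs(r) - 0.5 * d_hub)          # linear part of (7)


def smooth_l1(r, d_sl1=1.0):
    if abs(r) < d_sl1:                             # quadratic part of (8)
        return 0.5 * r * r / d_sl1
    return abs(r) - 0.5 * d_sl1                    # linear part of (8)


def quantile(r, d_qnt=0.5):
    indicator = 1.0 if r < 0 else 0.0
    return r * (d_qnt - indicator)                 # (9)


# Compare a typical residual with an outlier under each pointwise loss.
for name, chi in [("L2", l2), ("L1", l1), ("Cauchy", cauchy), ("Atan", atan),
                  ("Huber", huber), ("SmoothL1", smooth_l1),
                  ("Quantile", quantile)]:
    print(f"{name:9s} chi(0.1) = {chi(0.1):.4f}   chi(5.0) = {chi(5.0):.4f}")
```

With the default widths the outlier contributes 25 to the L$2$ loss but less than $π / 2$ to the Atan loss, which illustrates the robustness ordering described above.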

11.2 Optimization Problem Transformations

e04gnf uses auxiliary variables and constraints to implement non-differentiable loss functions (L$1$, Quantile, Linf) and regularization functions (L$1$). For any method that uses loss function derivatives to compute a search direction, a loss function which is not differentiable will cause problems, especially if it is non-differentiable at its minima. To solve this, we transform a model with non-differentiable loss or regularization functions into a new model with differentiable constraints and a differentiable objective function.

Consider the case where we select L$1$ loss and turn off regularization:

$\underset{x \in ℝ^{n_{var}}}{\text{minimize}} \sum_{i = 1}^{n_{res}} | r_{i} (x) | .$

This is equivalent to the problem:

$\begin{array}{ll} \underset{x \in ℝ^{n_{var}}, z_{i} \in ℝ}{\text{minimize}} & \sum_{i = 1}^{n_{res}} z_{i} \\ \text{subject to} & - z_{i} \leq r_{i} (x) \leq z_{i}, \quad \text{for } i = 1, \dots, n_{res}, \end{array}$

where the $z_{i} > 0$ are $n_{res}$ new auxiliary variables.

Each $z_{i}$ serves as an upper bound to $| r_{i} (x) |$, so that a minimum of $\sum_{i = 1}^{n_{res}} z_{i}$ in the search space comprising $x$ and $z_{i}$ is also a minimum of $\sum_{i = 1}^{n_{res}} | r_{i} (x) |$.

Similarly, for Quantile loss with no regularization, we transform the problem:

$\underset{x \in ℝ^{n_{var}}}{\text{minimize}} \sum_{i = 1}^{n_{res}} r_{i} (x) \cdot (d_{qnt} - I (r_{i} (x) < 0))$

into the equivalent model:

$\begin{array}{ll} \underset{x \in ℝ^{n_{var}}, z_{i} \in ℝ}{\text{minimize}} & \sum_{i = 1}^{n_{res}} z_{i} \\ \text{subject to} & - z_{i} / (1 - d_{qnt}) \leq r_{i} (x) \leq z_{i} / d_{qnt}, \quad \text{for } i = 1, \dots, n_{res} . \end{array}$

For Linf loss with no regularization, we transform the model:

$\underset{x \in ℝ^{n_{var}}}{\text{minimize}} \max_{i} | r_{i} (x) |$

into the model:

$\begin{array}{ll} \underset{x \in ℝ^{n_{var}}, z \in ℝ}{\text{minimize}} & z \\ \text{subject to} & - z \leq r_{i} (x) \leq z, \quad \text{for } i = 1, \dots, n_{res} . \end{array}$

If regularization is present, we transform it separately from the loss function if the regularization is non-differentiable. For example, the problem

$\underset{x \in ℝ^{n_{var}}}{\text{minimize}} \sum_{i = 1}^{n_{res}} | r_{i} (x) | + ρ \sum_{i = 1}^{n_{var}} | x_{i} |$

is transformed to:

$\begin{array}{ll} \underset{x \in ℝ^{n_{var}}, z_{i} \in ℝ, v_{i} \in ℝ}{\text{minimize}} & \sum_{i = 1}^{n_{res}} z_{i} + ρ \sum_{i = 1}^{n_{var}} v_{i} \\ \text{subject to} & - z_{i} \leq r_{i} (x) \leq z_{i}, \quad \text{for } i = 1, \dots, n_{res}, \\ & - v_{i} \leq x_{i} \leq v_{i}, \quad \text{for } i = 1, \dots, n_{var}, \end{array}$

where the $v_{i}$ are the auxiliary variables for the regularization.
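
To see why these transformations preserve the objective, fix $x$ and look for the smallest auxiliary variables that satisfy the constraints. The Python sketch below does this for the L$1$, Quantile and Linf cases and compares the result with the original loss. It is a numerical illustration only, with arbitrary residual values and illustrative function names; it does not reproduce how e04gnf builds the transformed problem internally.

```python
def smallest_z_l1(r):
    """Smallest z with -z <= r <= z."""
    return max(r, -r)


def smallest_z_quantile(r, d_qnt):
    """Smallest z with -z / (1 - d_qnt) <= r <= z / d_qnt."""
    return max(d_qnt * r, -(1.0 - d_qnt) * r)


def smallest_z_linf(residuals):
    """Smallest single z with -z <= r_i <= z for every i."""
    return max(max(r, -r) for r in residuals)


def quantile_loss(r, d_qnt):
    """Quantile loss (9): r * (d_qnt - I(r < 0))."""
    return r * (d_qnt - (1.0 if r < 0 else 0.0))


residuals = [0.5, -1.25, 2.0, -0.1]  # arbitrary illustrative values
d_qnt = 0.95

l1_direct = sum(abs(r) for r in residuals)
l1_via_z = sum(smallest_z_l1(r) for r in residuals)

q_direct = sum(quantile_loss(r, d_qnt) for r in residuals)
q_via_z = sum(smallest_z_quantile(r, d_qnt) for r in residuals)

linf_direct = max(abs(r) for r in residuals)
linf_via_z = smallest_z_linf(residuals)

print(l1_direct, l1_via_z)
print(q_direct, q_via_z)
print(linf_direct, linf_via_z)
```

In each pair the two numbers agree (up to rounding for the Quantile case), so minimizing over the auxiliary variables recovers the original non-differentiable loss while every constraint remains differentiable in $x$ and $z$.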

12 Optional Parameters

Several optional parameters in e04gnf define choices in the problem specification or the algorithm logic. In order to reduce the number of formal arguments of e04gnf, these optional parameters have associated default values that are appropriate for most problems. Therefore, you need only specify those optional parameters whose values are to be different from their default values.

The remainder of this section can be skipped if you wish to use the default values for all optional parameters.

The optional parameters can be changed by calling e04zmf at any time between the initialization of the handle and the call to the solver. Modification of the optional parameters during intermediate monitoring stops is not allowed. Once the solver finishes, the optional parameters can be altered again for the next solve.

The option values may be retrieved by e04znf.

The following is a list of the optional parameters available. A full description of each optional parameter is provided in Section 12.1.

12.1 Description of the Optional Parameters

For each option, we give a summary line, a description of the optional parameter and details of constraints.

The summary line contains:

the keywords, where the minimum abbreviation of each keyword is underlined;
a parameter value, where the letters $a$ , $i$ and $r$ denote options that take character, integer and real values respectively;
the default value, where the symbol $ε$ is a generic notation for machine precision (see x02ajf).

All options accept the value DEFAULT to return single options to their default states.

Keywords and character values are case and white space insensitive.

Defaults

This special keyword may be used to reset all optional parameters to their default values. Any value given with this keyword will be ignored.

NLDF Iteration Limit    $i$    Default $= 10000000$

The maximum number of iterations to be performed by e04gnf. If this limit is reached, then the solver will terminate with ifail = 22.

Constraint: NLDF Iteration Limit $\geq 1$.

NLDF Monitor Frequency    $i$    Default $= 0$

If NLDF Monitor Frequency $> 0$, the user-supplied subroutine monit will be called at the end of every $i$th step for monitoring purposes.

Constraint: NLDF Monitor Frequency $\geq 0$.

NLDF Stop Tolerance    $r$    Default $= \max (10^{-6}, \sqrt{ε})$

This parameter sets the value of $ε_{tol}$ which specifies the tolerance for the optimality measure.

When both the loss function and the regularization are differentiable, and only simple bound constraints are present, the optimality measures are defined by (6) and (7) in e04kff and NLDF Stop Tolerance is passed to the solver e04kff as FOAS Stop Tolerance.

When either the loss function or the regularization is non-differentiable, or when linear, quadratic or general nonlinear constraints are present, the optimality measure is defined by (10) in e04stf and NLDF Stop Tolerance is passed to the solver e04stf as Stop Tolerance 1.

Constraint: $0 \leq ε_{tol} < 1$.
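
As a worked instance of the default formula, the Python sketch below evaluates $\max (10^{-6}, \sqrt{ε})$ for IEEE double precision. It assumes $ε$ is the unit roundoff (half of Python's `sys.float_info.epsilon`); taking the full spacing instead gives the same maximum, so the conclusion does not depend on that assumption.

```python
import math
import sys

# Assumption: machine precision is the unit roundoff of IEEE double precision.
eps = sys.float_info.epsilon / 2.0

# sqrt(eps) is about 1e-8, so the first argument of max() dominates.
default_stop_tol = max(1.0e-6, math.sqrt(eps))
print(default_stop_tol)
```

On such a platform the default therefore evaluates to $10^{-6}$, in line with the value 1.00000E-06 shown in the sample option listing.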

NLDF Loss Function Type    $a$    Default $=$ L2

This parameter sets the loss function type used in the objective function.

Constraint: NLDF Loss Function Type $=$ HUBER, L2, CAUCHY, ATAN, SMOOTHL1, LINF, L1 or QUANTILE.

Reg Term Type    $a$    Default $=$ OFF

This parameter sets the regularization function type used in the objective function.

Note: if there is no residual in the model, regularization will be turned off.

Constraint: Reg Term Type $=$ OFF, L2 or L1.

NLDF Huber Function Width    $r$    Default $= 1.0$

Sets the parameter $d_{hub}$ defined in the Huber loss function (7).

Constraint: NLDF Huber Function Width $> 0$.

NLDF Cauchy Function Sharpness    $r$    Default $= 1.0$

Sets the parameter $d_{cau}$ defined in the Cauchy loss function (4).

Constraint: NLDF Cauchy Function Sharpness $> 0$.

NLDF SmoothL1 Function Width    $r$    Default $= 1.0$

Sets the parameter $d_{sl 1}$ defined in the SmoothL1 loss function (8).

Constraint: NLDF SmoothL1 Function Width $> 0$.

NLDF Quantile Parameter    $r$    Default $= 0.5$

Sets the parameter $d_{qnt}$ defined in the Quantile loss function (9).

Constraint: $0 <$ NLDF Quantile Parameter $< 1$.

Reg Coefficient    $r$    Default $= 1.0$

Sets the regularization coefficient $ρ$ in the definition of the objective function (1). Note: if set to $0$, regularization will be turned off.

Constraint: Reg Coefficient $\geq 0$.

Infinite Bound Size    $r$    Default $= 10^{20}$

This defines the ‘infinite’ bound $bigbnd$ in the definition of the problem constraints. Any upper bound greater than or equal to $bigbnd$ will be regarded as $+ \infty$ (and similarly any lower bound less than or equal to $- bigbnd$ will be regarded as $- \infty$). Note that a modification of this optional parameter does not influence constraints which have already been defined; only the constraints formulated after the change will be affected.

Constraint: Infinite Bound Size $\geq 1000$.
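
The rule for interpreting large bounds can be written out as a small Python helper. This is a sketch of the rule as described above, with a hypothetical function name; it is not a NAG routine.

```python
import math


def effective_bounds(lower, upper, bigbnd=1.0e20):
    """Map bounds at or beyond +/- bigbnd to infinities (illustrative helper)."""
    lo = -math.inf if lower <= -bigbnd else lower
    hi = math.inf if upper >= bigbnd else upper
    return lo, hi


# A variable bounded below by -1.0 with an 'infinite' upper bound.
print(effective_bounds(-1.0, 1.0e20))
```

A one-sided bound such as $- 1.0 \leq x_{1}$ from the example in Section 10 can thus be expressed by supplying an upper bound of at least $bigbnd$.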

Monitoring File    $i$    Default $= -1$

If $i \geq 0$, the unit number for the secondary (monitoring) output. If Monitoring File $= -1$, no secondary output is provided. The information output to this unit is controlled by Monitoring Level.

Constraint: Monitoring File $\geq -1$.

Monitoring Level    $i$    Default $= 4$

This parameter sets the amount of information detail that will be printed by the solver to the secondary output. The meaning of the levels is the same as for Print Level.

Constraint: $0 \leq$ Monitoring Level $\leq 5$.

Print File    $i$    Default $=$ advisory message unit number

If $i \geq 0$, the unit number for the primary output of the solver. If Print File $= -1$, the primary output is completely turned off independently of other settings. The default value is the advisory message unit number as defined by x04abf at the time of the initialization of the optional parameters, e.g., at the initialization of the handle. The information output to this unit is controlled by Print Level.

Constraint: Print File $\geq -1$.

Print Level    $i$    Default $= 2$

This parameter defines how detailed information should be printed by the solver to the primary and secondary output.

$i$	Output
$0$	No output from the solver.
$1$	The Header and Summary.
$2$, $3$, $4$, $5$	Additionally, the Iteration log.

Constraint: $0 \leq$ Print Level $\leq 5$.

Print Options    $a$    Default $=$ YES

If Print Options $=$ YES, a listing of optional parameters will be printed to the primary output and is always printed to the secondary output.

Constraint: Print Options $=$ YES or NO.

Print Solution    $a$    Default $=$ NO

If Print Solution $=$ YES or X, the final values of the primal variables are printed on the primary and secondary outputs.

Constraint: Print Solution $=$ YES, NO or X.

Stats Time    $a$    Default $=$ NO

This parameter turns on timing. This might be helpful for a choice of different solving approaches. It is possible to choose between CPU and wall clock time. Choice YES is equivalent to WALL CLOCK.

Constraint: Stats Time $=$ YES, NO, CPU or WALL CLOCK.

Time Limit    $r$    Default $= 10^{6}$

A limit to the number of seconds that the solver can use to solve one problem. If at the end of an iteration this limit is exceeded, the solver will terminate with