c05qsf is based on the MINPACK routine HYBRD1 (see Moré et al. (1980)). It chooses the correction at each step as a convex combination of the Newton and scaled gradient directions. The Jacobian is updated by the sparse rank-1 method of Schubert (see Schubert (1970)). At the starting point, the sparsity pattern is determined and the Jacobian is approximated by forward differences, but these are not used again until the rank-1 method fails to produce satisfactory progress. Then, the sparsity structure is used to recompute an approximation to the Jacobian by forward differences with the least number of function evaluations. The subroutine you supply must be able to compute only the requested subset of the function values. The sparse Jacobian linear system is solved at each iteration with f11mef computing the Newton step. For more details see Powell (1970) and Broyden (1965).
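The idea of recomputing the Jacobian by forward differences with few function evaluations, while asking the user routine for only a subset of function values, can be sketched as follows. This is a minimal illustration, not the scheme c05qsf implements: the greedy column grouping, the names `residuals`, `group_columns` and `sparse_fd_jacobian`, the step `h`, and the tridiagonal test system are all assumptions made for the sketch.

```python
def residuals(x, indices):
    """Return {i: f_i(x)} for the requested 0-based indices only.

    Hypothetical tridiagonal test system, chosen for this sketch:
    f_i = (3 - 2*x_i)*x_i - x_{i-1} - 2*x_{i+1} + 1, with x_{-1} = x_n = 0.
    """
    n = len(x)
    values = {}
    for i in indices:
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        values[i] = (3.0 - 2.0 * x[i]) * x[i] - left - 2.0 * right + 1.0
    return values


def group_columns(pattern):
    """Greedily group columns so that no two columns in a group share a row.

    pattern[j] is the set of rows i for which dF_i/dx_j may be nonzero.
    """
    groups = []  # each entry: (list of columns, set of rows they touch)
    for j, rows_j in enumerate(pattern):
        for cols, rows in groups:
            if rows.isdisjoint(rows_j):
                cols.append(j)
                rows.update(rows_j)
                break
        else:
            groups.append(([j], set(rows_j)))
    return [cols for cols, _ in groups]


def sparse_fd_jacobian(x, pattern, h=1.0e-7):
    """Forward-difference Jacobian using one extra evaluation per group."""
    n = len(x)
    f0 = residuals(x, range(n))
    groups = group_columns(pattern)
    jac = {}  # (i, j) -> approximation to dF_i/dx_j
    for cols in groups:
        xp = list(x)
        for j in cols:
            xp[j] += h  # perturb every column of the group at once
        rows = sorted(set().union(*(pattern[j] for j in cols)))
        fp = residuals(xp, rows)  # only the affected rows are requested
        for j in cols:
            for i in pattern[j]:
                jac[(i, j)] = (fp[i] - f0[i]) / h
    return jac, len(groups)


n = 9
pattern = [{i for i in (j - 1, j, j + 1) if 0 <= i < n} for j in range(n)]
jac, ngroups = sparse_fd_jacobian([-1.0] * n, pattern)
print(ngroups, len(jac))
```

For this 9-variable tridiagonal pattern the greedy grouping yields three groups covering all 25 structural nonzeros, so a refresh costs three perturbed evaluations instead of nine. The sorted row list passed to `residuals` mirrors the requirement that requested indices arrive in strictly ascending order.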
4 References
Broyden C G (1965) A class of methods for solving nonlinear simultaneous equations Mathematics of Computation 19(92) 577–593
Moré J J, Garbow B S and Hillstrom K E (1980) User guide for MINPACK-1 Technical Report ANL-80-74 Argonne National Laboratory
Powell M J D (1970) A hybrid method for nonlinear algebraic equations Numerical Methods for Nonlinear Algebraic Equations (ed P Rabinowitz) Gordon and Breach
Schubert L K (1970) Modification of a quasi-Newton method for nonlinear equations with a sparse Jacobian Mathematics of Computation 24(109) 27–30
5 Arguments
1: $\mathbf{fcn}$ – Subroutine, supplied by the user. External Procedure
fcn must return the values of the functions ${f}_{i}$ at a point $x$.
On entry: indf specifies the indices $i$ for which values of ${f}_{i}\left(x\right)$ must be computed. The indices are specified in strictly ascending order.
4: $\mathbf{x}\left({\mathbf{n}}\right)$ – Real (Kind=nag_wp) array Input
On entry: the components of the point $x$ at which the functions must be evaluated. ${\mathbf{x}}\left(i\right)$ contains the coordinate ${x}_{i}$.
5: $\mathbf{fvec}\left({\mathbf{n}}\right)$ – Real (Kind=nag_wp) array Output
On exit: ${\mathbf{fvec}}\left(i\right)$ must contain the function values ${f}_{i}\left(x\right)$, for all indices $i$ in indf.
7: $\mathbf{ruser}\left(*\right)$ – Real (Kind=nag_wp) array User Workspace
fcn is called with the arguments iuser and ruser as supplied to c05qsf. You should use the arrays iuser and ruser to supply information to fcn.
8: $\mathbf{iflag}$ – Integer Input/Output
On entry: ${\mathbf{iflag}}>0$.
On exit: in general, iflag should not be reset by fcn. If, however, you wish to terminate execution (perhaps because some illegal point x has been reached), iflag should be set to a negative integer.
fcn must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which c05qsf is called. Arguments denoted as Input must not be changed by this procedure.
Note: fcn should not return floating-point NaN (Not a Number) or infinity values, since these are not handled by c05qsf. If your code inadvertently does return any NaNs or infinities, c05qsf is likely to produce unexpected results.
2: $\mathbf{n}$ – Integer Input
On entry: $n$, the number of equations.
Constraint:
${\mathbf{n}}>0$.
3: $\mathbf{x}\left({\mathbf{n}}\right)$ – Real (Kind=nag_wp) array Input/Output
On entry: an initial guess at the solution vector. ${\mathbf{x}}\left(i\right)$ must contain the coordinate ${x}_{i}$.
On exit: the final estimate of the solution vector.
4: $\mathbf{fvec}\left({\mathbf{n}}\right)$ – Real (Kind=nag_wp) array Output
On exit: the function values at the final point returned in x. ${\mathbf{fvec}}\left(i\right)$ contains the function value ${f}_{i}$.
5: $\mathbf{xtol}$ – Real (Kind=nag_wp) Input
On entry: the accuracy in x to which the solution is required.
Suggested value:
$\sqrt{\epsilon}$, where $\epsilon $ is the machine precision returned by x02ajf.
Constraint:
${\mathbf{xtol}}\ge 0.0$.
6: $\mathbf{init}$ – Logical Input
On entry: init must be set to .TRUE. to indicate that this is the first time c05qsf is called for this specific problem. c05qsf then computes the dense Jacobian and detects and stores its sparsity pattern (in rcomm and icomm) before proceeding with the iterations. This is noticeably time consuming when n is large. If not enough storage has been provided for rcomm or icomm, c05qsf will fail. On exit with ${\mathbf{ifail}}={\mathbf{0}}$, ${\mathbf{2}}$, ${\mathbf{3}}$ or ${\mathbf{4}}$, ${\mathbf{icomm}}\left(1\right)$ contains $\mathit{nnz}$, the number of nonzero entries found in the Jacobian. On subsequent calls, init can be set to .FALSE. if the problem has a Jacobian of the same sparsity pattern. In that case, the computation time required for the detection of the sparsity pattern will be smaller.
7: $\mathbf{rcomm}\left({\mathbf{lrcomm}}\right)$ – Real (Kind=nag_wp) array Communication Array
rcomm must not be altered between successive calls to c05qsf.
8: $\mathbf{lrcomm}$ – Integer Input
On entry: the dimension of the array rcomm as declared in the (sub)program from which c05qsf is called.
Constraint:
${\mathbf{lrcomm}}\ge 12+\mathit{nnz}$ where $\mathit{nnz}$ is the number of nonzero entries in the Jacobian, as computed by c05qsf.
If ${\mathbf{ifail}}={\mathbf{0}}$, ${\mathbf{2}}$, ${\mathbf{3}}$ or ${\mathbf{4}}$ on exit, ${\mathbf{icomm}}\left(1\right)$ contains $\mathit{nnz}$ where $\mathit{nnz}$ is the number of nonzero entries in the Jacobian.
icomm must not be altered between successive calls to c05qsf.
10: $\mathbf{licomm}$ – Integer Input
On entry: the dimension of the array icomm as declared in the (sub)program from which c05qsf is called.
Constraint:
${\mathbf{licomm}}\ge 8\times {\mathbf{n}}+19+\mathit{nnz}$ where $\mathit{nnz}$ is the number of nonzero entries in the Jacobian, as computed by c05qsf.
12: $\mathbf{ruser}\left(*\right)$ – Real (Kind=nag_wp) array User Workspace
iuser and ruser are not used by c05qsf, but are passed directly to fcn and may be used to pass information to this routine.
13: $\mathbf{ifail}$ – Integer Input/Output
On entry: ifail must be set to $0$, $\mathrm{-1}$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes the printing of an error message and program execution will be halted; otherwise program execution continues. A value of $\mathrm{-1}$ means that an error message is printed while a value of $1$ means that it is not.
If halting is not appropriate, the value $\mathrm{-1}$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
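The lower bounds on lrcomm and licomm depend on $\mathit{nnz}$, which is only known once c05qsf has detected the sparsity pattern. The arithmetic can be sketched as below; the helper names are hypothetical, and the dense bound $\mathit{nnz}\le {n}^{2}$ is used here only as a conservative choice for a first call when the pattern is not known in advance.

```python
def min_comm_lengths(n, nnz):
    """Smallest lrcomm and licomm allowed by the documented constraints."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= nnz <= n * n:
        raise ValueError("nnz must lie between 0 and n*n")
    lrcomm = 12 + nnz
    licomm = 8 * n + 19 + nnz
    return lrcomm, licomm


def safe_comm_lengths(n):
    """Conservative sizes when nnz is unknown: assume a dense Jacobian."""
    return min_comm_lengths(n, n * n)


# A 9 x 9 tridiagonal Jacobian has 3*9 - 2 = 25 structural nonzeros.
tight = min_comm_lengths(9, 25)
dense = safe_comm_lengths(9)
print(tight, dense)
```

After a successful first call, the value of $\mathit{nnz}$ reported in ${\mathbf{icomm}}\left(1\right)$ could be used to shrink the arrays for later problems with the same pattern.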
6 Error Indicators and Warnings
If on entry ${\mathbf{ifail}}=0$ or $\mathrm{-1}$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
${\mathbf{ifail}}=2$
There have been at least $200\times ({\mathbf{n}}+1)$ calls to fcn. Consider setting ${\mathbf{init}}=\mathrm{.FALSE.}$ and restarting the calculation from the point held in x.
${\mathbf{ifail}}=3$
No further improvement in the solution is possible. xtol is too small: ${\mathbf{xtol}}=\langle\mathit{\text{value}}\rangle$.
${\mathbf{ifail}}=4$
The iteration is not making good progress. This failure exit may indicate that the system does not have a zero, or that the solution is very close to the origin (see Section 7). Otherwise, rerunning c05qsf from a different starting point may avoid the region of difficulty. The condition number of the Jacobian is $\langle\mathit{\text{value}}\rangle$.
If this condition is satisfied with ${\mathbf{xtol}}={10}^{-k}$, then the larger components of $x$ have $k$ significant decimal digits. There is a danger that the smaller components of $x$ may have large relative errors, but the fast rate of convergence of c05qsf usually obviates this possibility.
If xtol is less than machine precision and the above test is satisfied with the machine precision in place of xtol, then the routine exits with ${\mathbf{ifail}}={\mathbf{3}}$.
Note: this convergence test is based purely on relative error, and may not indicate convergence if the solution is very close to the origin.
The convergence test assumes that the functions are reasonably well behaved. If this condition is not satisfied, then c05qsf may incorrectly indicate convergence. The validity of the answer can be checked, for example, by rerunning c05qsf with a lower value for xtol.
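The two caveats above (small components may carry large relative errors, and a purely relative test struggles near the origin) can be illustrated numerically. The sketch assumes the test has the relative-error form $\Vert x-\hat{x}\Vert_2 \le \mathbf{xtol}\times\Vert\hat{x}\Vert_2$ with $\hat{x}$ the true solution; that form, the function names and the sample vectors are assumptions for illustration, and in practice $\hat{x}$ is of course unknown.

```python
import math


def norm2(v):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def relative_test(x, x_hat, xtol):
    """Assumed form of the test: ||x - x_hat|| <= xtol * ||x_hat||."""
    diff = [a - b for a, b in zip(x, x_hat)]
    return norm2(diff) <= xtol * norm2(x_hat)


def component_rel_errors(x, x_hat):
    """Relative error of each component (x_hat components must be nonzero)."""
    return [abs(a - b) / abs(b) for a, b in zip(x, x_hat)]


xtol = 1.0e-6

# Badly scaled solution: one large and one small component.
x_hat = [1000.0, 0.001]
x = [1000.0005, 0.0015]
passed = relative_test(x, x_hat, xtol)
errors = component_rel_errors(x, x_hat)
print(passed, errors)

# Solution at the origin: the right-hand side of the test is zero,
# so any nonzero error fails it, however small.
near_origin = relative_test([1.0e-12, 0.0], [0.0, 0.0], xtol)
print(near_origin)
```

In the first case the test passes although the small component is wrong by about 50 per cent; in the second it fails despite an absolute error of only $10^{-12}$. Rescaling the variables so that the solution components are of comparable size is one way to reduce the first effect.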
8 Parallelism and Performance
Background information to multithreading can be found in the Multithreading documentation.
c05qsf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
c05qsf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
9 Further Comments
Local workspace arrays of fixed lengths are allocated internally by c05qsf. The total size of these arrays amounts to $8\times n+2\times q$ real elements and $10\times n+2\times q+5$ integer elements where the integer $q$ is bounded by $8\times \mathit{nnz}$ and ${n}^{2}$ and depends on the sparsity pattern of the Jacobian.
The time required by c05qsf to solve a given problem depends on $n$, the behaviour of the functions, the accuracy requested and the starting point. The number of arithmetic operations executed by c05qsf to process each evaluation of the functions depends on the number of nonzero entries in the Jacobian. The timing of c05qsf is strongly influenced by the time spent evaluating the functions.
When init is .TRUE., the dense Jacobian is first evaluated and that will take time proportional to ${n}^{2}$.
Ideally the problem should be scaled so that, at the solution, the function values are of comparable magnitude.
10 Example
This example determines the values ${x}_{1},\dots ,{x}_{9}$ which satisfy the tridiagonal equations: