NAG Library Routine Document
d03uaf (dim2_ellip_fd_iter)
1
Purpose
d03uaf performs at each call one iteration of the Strongly Implicit Procedure. It is used to calculate on successive calls a sequence of approximate corrections to the current estimate of the solution when solving a system of simultaneous algebraic equations for which the iterative update matrix is of fivepoint molecule form on a twodimensional topologicallyrectangular mesh. (‘Topological’ means that a polar grid $\left(r,\theta \right)$, for example, can be used as it is equivalent to a rectangular box.)
2
Specification
Fortran Interface
Subroutine d03uaf ( 
n1, n2, lda, a, b, c, d, e, aparam, it, r, wrksp1, wrksp2, ifail) 
Integer, Intent (In)  ::  n1, n2, lda, it  Integer, Intent (Inout)  ::  ifail  Real (Kind=nag_wp), Intent (In)  ::  a(lda,n2), b(lda,n2), c(lda,n2), d(lda,n2), e(lda,n2), aparam  Real (Kind=nag_wp), Intent (Inout)  ::  r(lda,n2), wrksp1(lda,n2), wrksp2(lda,n2) 

C Header Interface
#include <nagmk26.h>
void 
d03uaf_ (const Integer *n1, const Integer *n2, const Integer *lda, const double a[], const double b[], const double c[], const double d[], const double e[], const double *aparam, const Integer *it, double r[], double wrksp1[], double wrksp2[], Integer *ifail) 

3
Description
Given a set of simultaneous equations
(which could be nonlinear) derived, for example, from a finite difference representation of a twodimensional elliptic partial differential equation and its boundary conditions, the solution
$t$ may be obtained iteratively from a starting approximation
${t}^{\left(1\right)}$ by the formulae
Thus
${r}^{\left(n\right)}$ is the residual of the
$n$th approximate solution
${t}^{\left(n\right)}$, and
${s}^{\left(n\right)}$ is the update change vector.
d03uaf determines the approximate change vector
$s$ corresponding to a given residual
$r$, i.e., it determines an approximate solution to a set of equations
where
$M$ is a square
$\left({n}_{1}\times {n}_{2}\right)$ by
$\left({n}_{1}\times {n}_{2}\right)$ matrix and
$r$ is a known vector of length
${n}_{1}\times {n}_{2}$. The set of equations
(2) must be of fivediagonal form
for
$i=1,2,\dots ,{n}_{1}$ and
$j=1,2,\dots ,{n}_{2}$, provided that
${c}_{ij}\ne 0.0$. Indeed, if
${c}_{ij}=0.0$, then the equation is assumed to be
For example, if
${n}_{1}=3$ and
${n}_{2}=2$, the equations take the form
The calling program supplies the current residual
$r$ at each iteration and the coefficients of the fivepoint molecule system of equations on which the update procedure is based. The routine performs one iteration, using the approximate
$LU$ factorization of the Strongly Implicit Procedure with the necessary acceleration argument adjustment, to calculate the approximate solution
$s$ of the set of equations
(2). The change
$s$ overwrites the residual array for return to the calling program. The calling program must combine this change stored in
$r$ with the old approximation to obtain the new approximate solution for
$t$. It must then recalculate the residuals and, if the accuracy requirements have not been satisfied, commence the next iterative cycle.
Clearly there is no requirement that the iterative update matrix passed in the form of the fivediagonal element arrays
a,
b,
c,
d and
e is the same as that used to calculate the residuals, and therefore the one governing the problem. However, the convergence may be impaired if they are not equal. Indeed, if the system of equations
(1) is not precisely of the fivediagonal form illustrated above but has a few additional terms, then the methods of deferred or defect correction can be employed. The residual is calculated by the calling program using the full system of equations, but the update formula is based on a fivediagonal system
(2) of the form given above. For example, the solution of a system of ninediagonal equations each involving the combination of terms with
${t}_{i\pm 1,j\pm 1},{t}_{i\pm 1,j},{t}_{i,j\pm 1}$ and
${t}_{ij}$ could use the fivediagonal coefficients on which to base the update, provided these incorporate the major features of the equations.
Problems in topologically nonrectangular regions can be solved using the routine by surrounding the region with a circumscribing topological rectangle. The equations for the nodal values external to the region of interest are set to zero (i.e., ${c}_{ij}={r}_{ij}=0$) and the boundary conditions are incorporated into the equations for the appropriate nodes.
If there is no better initial approximation when starting the iterative cycle, one can use an array of all zeros as the initial approximation from which the first set of residuals are determined.
The routine can be used to solve linear elliptic equations in which case the arrays
a,
b,
c,
d,
e and the quantities
$q$ will be unchanged during the iterative cycles, or for solving nonlinear elliptic equations in which case some or all of these arrays may require updating as each new approximate solution is derived. Depending on the nonlinearity, some underrelaxation of the coefficients and/or source terms may be needed during their recalculation using the new estimates of the solution (see
Jacobs (1972)).
The routine can also be used to solve each step of a timedependent parabolic equation in two space dimensions. The solution at each time step can be expressed in terms of an elliptic equation if the Crank–Nicolson or other form of implicit time integration is used.
Neither diagonal dominance, nor positivedefiniteness, of the matrix
$M$ or of the update matrix formed from the arrays
a,
b,
c,
d and
e is necessary to ensure convergence.
For problems in which the solution is not unique, in the sense that an arbitrary constant can be added to the solution (for example Laplace's equation with all Neumann boundary conditions), the calling program should subtract a typical nodal value from the whole solution $t$ at every iteration to keep rounding errors to a minimum.
4
References
Ames W F (1977) Nonlinear Partial Differential Equations in Engineering (2nd Edition) Academic Press
Jacobs D A H (1972) The strongly implicit procedure for the numerical solution of parabolic and elliptic partial differential equations Note RD/L/N66/72 Central Electricity Research Laboratory
Stone H L (1968) Iterative solution of implicit approximations of multidimensional partial differential equations SIAM J. Numer. Anal. 5 530–558
5
Arguments
 1: $\mathbf{n1}$ – IntegerInput

On entry: the number of nodes in the first coordinate direction, ${n}_{1}$.
Constraint:
${\mathbf{n1}}>1$.
 2: $\mathbf{n2}$ – IntegerInput

On entry: the number of nodes in the second coordinate direction, ${n}_{2}$.
Constraint:
${\mathbf{n2}}>1$.
 3: $\mathbf{lda}$ – IntegerInput

On entry: the first dimension of the arrays
a,
b,
c,
d,
e,
r,
wrksp1 and
wrksp2 as declared in the (sub)program from which
d03uaf is called.
Constraint:
${\mathbf{lda}}\ge {\mathbf{n1}}$.
 4: $\mathbf{a}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayInput

On entry:
${\mathbf{a}}\left(\mathit{i},\mathit{j}\right)$ must contain the coefficient of the ‘southerly’ term involving
${s}_{i,j1}$ in the
$\left(i,j\right)$th equation of the system
(2), for
$\mathit{i}=1,2,\dots ,{\mathbf{n1}}$ and
$\mathit{j}=1,2,\dots ,{\mathbf{n2}}$. The elements of
a, for
$j=1$, must be zero after incorporating the boundary conditions, since they involve nodal values from outside the rectangle.
 5: $\mathbf{b}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayInput

On entry:
${\mathbf{b}}\left(\mathit{i},\mathit{j}\right)$ must contain the coefficient of the ‘westerly’ term involving
${s}_{i1,j}$ in the
$\left(i,j\right)$th equation of the system
(2), for
$\mathit{i}=1,2,\dots ,{\mathbf{n1}}$ and
$\mathit{j}=1,2,\dots ,{\mathbf{n2}}$. The elements of
b, for
$i=1$, must be zero after incorporating the boundary conditions, since they involve nodal values from outside the rectangle.
 6: $\mathbf{c}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayInput

On entry:
${\mathbf{c}}\left(\mathit{i},\mathit{j}\right)$ must contain the coefficient of the ‘central’ term involving
${s}_{\mathit{i}\mathit{j}}$ in the
$\left(\mathit{i},\mathit{j}\right)$th equation of the system
(2), for
$\mathit{i}=1,2,\dots ,{\mathbf{n1}}$ and
$\mathit{j}=1,2,\dots ,{\mathbf{n2}}$. The elements of
c are checked to ensure that they are nonzero. If any element is found to be zero, the corresponding algebraic equation is assumed to be
${s}_{\mathit{i}\mathit{j}}={r}_{\mathit{i}\mathit{j}}$. This feature can be used to define the equations for nodes at which, for example, Dirichlet boundary conditions are applied, or for nodes external to the problem of interest, by setting
${\mathbf{c}}\left(\mathit{i},\mathit{j}\right)=0.0$ at appropriate points. The corresponding value of
${\mathbf{r}}\left(\mathit{i},\mathit{j}\right)$ is set equal to the appropriate value, namely the difference between the prescribed value of
${t}_{\mathit{i}\mathit{j}}$ and the current value of
${t}_{\mathit{i}\mathit{j}}$ in the Dirichlet case, or zero at an external point.
 7: $\mathbf{d}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayInput

On entry:
${\mathbf{d}}\left(\mathit{i},\mathit{j}\right)$ must contain the coefficient of the ‘easterly’ term involving
${s}_{\mathit{i}+1,\mathit{j}}$ in the
$\left(\mathit{i},\mathit{j}\right)$th equation of the system
(2), for
$\mathit{i}=1,2,\dots ,{\mathbf{n1}}$ and
$\mathit{j}=1,2,\dots ,{\mathbf{n2}}$. The elements of
d, for
$i={\mathbf{n1}}$, must be zero after incorporating the boundary conditions, since they involve nodal values from outside the rectangle.
 8: $\mathbf{e}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayInput

On entry:
${\mathbf{e}}\left(\mathit{i},\mathit{j}\right)$ must contain the coefficient of the ‘northerly’ term involving
${s}_{\mathit{i},\mathit{j}+1}$ in the
$\left(\mathit{i},\mathit{j}\right)$th equation of the system
(2), for
$\mathit{i}=1,2,\dots ,{\mathbf{n1}}$ and
$\mathit{j}=1,2,\dots ,{\mathbf{n2}}$. The elements of
e, for
$j={\mathbf{n2}}$, must be zero after incorporating the boundary conditions, since they involve nodal values from outside the rectangle.
 9: $\mathbf{aparam}$ – Real (Kind=nag_wp)Input

On entry: the iteration acceleration factor. A value of $1.0$ is adequate for most typical problems. However, if convergence is slow, the value can be reduced, typically to $0.2$ or $0.1$. If divergence is obtained, the value can be increased, typically to $2.0$, $5.0$ or $10.0$.
Constraint:
$0.0<{\mathbf{aparam}}\le \left({\left({\mathbf{n1}}1\right)}^{2}+{\left({\mathbf{n2}}1\right)}^{2}\right)/2.0$.
 10: $\mathbf{it}$ – IntegerInput

On entry: the iteration number. It must be initialized, but not necessarily to
$1$, before the first call, and must be incremented by one in the calling program for each subsequent call.
d03uaf uses the counter to select the appropriate acceleration argument from a sequence of nine, each one being used twice in succession. (Note that the acceleration argument depends on the value of
aparam.)
 11: $\mathbf{r}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayInput/Output

On entry:
${\mathbf{r}}\left(\mathit{i},\mathit{j}\right)$ must contain the current residual
${r}_{\mathit{i}\mathit{j}}$ on the righthand side of the
$\left(\mathit{i},\mathit{j}\right)$th equation of the system
(2), for
$\mathit{i}=1,2,\dots ,{\mathbf{n1}}$ and
$\mathit{j}=1,2,\dots ,{\mathbf{n2}}$.
On exit: these residuals are overwritten by the corresponding components of solution
$s$ to the system
(2), i.e., the changes to be made to the vector
$t$ to reduce the residuals supplied.
 12: $\mathbf{wrksp1}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayWorkspace
 13: $\mathbf{wrksp2}\left({\mathbf{lda}},{\mathbf{n2}}\right)$ – Real (Kind=nag_wp) arrayWorkspace

 14: $\mathbf{ifail}$ – IntegerInput/Output

On entry:
ifail must be set to
$0$,
$1\text{or}1$. If you are unfamiliar with this argument you should refer to
Section 3.4 in How to Use the NAG Library and its Documentation for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$1\text{or}1$ is recommended. If the output of error messages is undesirable, then the value
$1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is
$0$.
When the value $\mathbf{1}\text{or}\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit:
${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry
${\mathbf{ifail}}=0$ or
$1$, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

On entry, ${\mathbf{n1}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{n1}}>1$.
On entry, ${\mathbf{n2}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{n2}}>1$.
 ${\mathbf{ifail}}=2$

On entry, ${\mathbf{lda}}=\u2329\mathit{\text{value}}\u232a$ and ${\mathbf{n1}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{lda}}\ge {\mathbf{n1}}$.
 ${\mathbf{ifail}}=3$

On entry, ${\mathbf{aparam}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{aparam}}>0.0$.
 ${\mathbf{ifail}}=4$

On entry, ${\mathbf{aparam}}=\u2329\mathit{\text{value}}\u232a$, ${\mathbf{n1}}=\u2329\mathit{\text{value}}\u232a$ and ${\mathbf{n2}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{aparam}}\le \left({\left({\mathbf{n1}}1\right)}^{2}+{\left({\mathbf{n2}}1\right)}^{2}\right)/2$.
 ${\mathbf{ifail}}=99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 3.9 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=399$
Your licence key may have expired or may not have been installed correctly.
See
Section 3.8 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=999$
Dynamic memory allocation failed.
See
Section 3.7 in How to Use the NAG Library and its Documentation for further information.
7
Accuracy
The improvement in accuracy for each iteration, i.e., on each call, depends on the size of the system and on the condition of the update matrix characterised by the fivediagonal coefficient arrays. The ultimate accuracy obtainable depends on the above factors and on the machine precision. However, since d03uaf works with residuals and the update vector, the calling program can, in most cases where at each iteration all the residuals are usually of about the same size, calculate the residuals from extended precision values of the function, source term and equation coefficients if greater accuracy is required. The rate of convergence obtained with the Strongly Implicit Procedure is not always smooth because of the cyclic use of nine acceleration arguments. The convergence may become slow with very large problems. The final accuracy obtained can be judged approximately from the rate of convergence determined from the changes to the dependent variable $t$ and in particular the change on the last iteration.
8
Parallelism and Performance
d03uaf is not threaded in any implementation.
The time taken is approximately proportional to ${\mathbf{n1}}\times {\mathbf{n2}}$ for each call.
When used with deferred or defect correction, the residual is calculated in the calling program from a different system of equations to those represented by the fivepoint molecule coefficients used by
d03uaf as the basis of the iterative update procedure. When using deferred correction the overall rate of convergence depends not only on the items detailed in
Section 7 but also on the difference between the two coefficient matrices used.
Convergence may not always be obtained when the problem is very large and/or the coefficients of the equations have widely disparate values. The latter case may be associated with an illconditioned matrix.
10
Example
This example solves Laplace's equation in a rectangle with a nonuniform grid spacing in the $x$ and $y$ coordinate directions and with Dirichlet boundary conditions specifying the function on the perimeter of the rectangle equal to ${e}^{\left(1.0+x\right)/y\left({n}_{2}\right)}\times \mathrm{cos}\left(y/y\left({n}_{2}\right)\right)$.
10.1
Program Text
Program Text (d03uafe.f90)
10.2
Program Data
Program Data (d03uafe.d)
10.3
Program Results
Program Results (d03uafe.r)