NAG Library Routine Document

F11DTF

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Parameters

6 Error Indicators and Warnings

7 Accuracy

8 Further Comments

9 Example

9.1 Program Text

9.2 Program Data

9.3 Program Results

1 Purpose

F11DTF computes a block diagonal incomplete $LU$ factorization of a complex sparse non-Hermitian matrix, represented in coordinate storage format. The diagonal blocks may be composed of arbitrary rows and the corresponding columns, and may overlap. This factorization can be used to provide a block Jacobi or additive Schwarz preconditioner, for use in combination with F11BSF or F11DUF.

2 Specification

SUBROUTINE F11DTF (N, NNZ, A, LA, IROW, ICOL, NB, ISTB, INDB, LINDB, LFILL, DTOL, PSTRAT, MILU, IPIVP, IPIVQ, ISTR, IDIAG, NNZC, NPIVM, IWORK, LIWORK, IFAIL)

INTEGER	N, NNZ, LA, IROW(LA), ICOL(LA), NB, ISTB(NB+1), INDB(LINDB), LINDB, LFILL(NB), IPIVP(LINDB), IPIVQ(LINDB), ISTR(LINDB+1), IDIAG(LINDB), NNZC, NPIVM(NB), IWORK(LIWORK), LIWORK, IFAIL
REAL (KIND=nag_wp)	DTOL(NB)
COMPLEX (KIND=nag_wp)	A(LA)
CHARACTER(1)	PSTRAT(NB), MILU(NB)

3 Description

F11DTF computes an incomplete $LU$ factorization (see Meijerink and Van der Vorst (1977) and Meijerink and Van der Vorst (1981)) of the (possibly overlapping) diagonal blocks $A_b$, $b = 1, 2, \dots, NB$, of a complex sparse non-Hermitian $n$ by $n$ matrix $A$. The factorization is intended primarily for use as a block Jacobi or additive Schwarz preconditioner (see Saad (1996)), with one of the iterative solvers F11BSF and F11DUF.

The NB diagonal blocks need not consist of consecutive rows and columns of $A$, but may be composed of arbitrarily indexed rows, and the corresponding columns, as defined in the arguments INDB and ISTB. Any given row or column index may appear in more than one diagonal block, resulting in overlap. Each diagonal block $A_b$, $b = 1, 2, \dots, NB$, is factorized as:

$A_b = M_b + R_b$

where

$M_b = P_b L_b D_b U_b Q_b$

and $L_b$ is lower triangular with unit diagonal elements, $D_b$ is diagonal, $U_b$ is upper triangular with unit diagonal elements, $P_b$ and $Q_b$ are permutation matrices, and $R_b$ is a remainder matrix.
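As an illustration of how a diagonal block is assembled from arbitrary rows and the corresponding columns, the following Python sketch extracts $A_b$ from a sparse matrix held as a dictionary. It is independent of the NAG Library; the function name `extract_block`, the dictionary representation and the sample data are assumptions made for illustration only.

```python
def extract_block(entries, rows):
    """Return the diagonal block of a sparse matrix for the given row indices.

    entries: dict mapping (i, j) -> value, with 1-based indices (illustrative).
    rows:    list of 1-based row indices defining the block; the same
             indices select the columns.
    The result maps local 1-based (p, q) positions to values.
    """
    local = {g: p for p, g in enumerate(rows, start=1)}  # global -> local index
    block = {}
    for (i, j), v in entries.items():
        if i in local and j in local:
            block[(local[i], local[j])] = v
    return block


# 4 x 4 example with two overlapping blocks sharing global index 2.
A = {(1, 1): 2 + 1j, (1, 2): -1, (2, 2): 3, (2, 4): 1j, (4, 2): -1j, (4, 4): 5}
block_1 = extract_block(A, [1, 2])
block_2 = extract_block(A, [2, 4])
```

Entry (2, 2) of the full matrix appears in both blocks, which is the overlap described above.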

The amount of fill-in occurring in the factorization of block $b$ can vary from zero to complete fill, and can be controlled by specifying either the maximum level of fill $LFILL(b)$, or the drop tolerance $DTOL(b)$. The parameter $PSTRAT(b)$ defines the pivoting strategy to be used in block $b$. The options currently available are no pivoting, user-defined pivoting, partial pivoting by columns for stability, and complete pivoting by rows for sparsity and by columns for stability. The factorization may optionally be modified to preserve the row-sums of the original block matrix.

The sparse matrix $A$ is represented in coordinate storage (CS) format (see Section 2.1.1 in the F11 Chapter Introduction). The array A stores all the nonzero elements of the matrix $A$, while arrays IROW and ICOL store the corresponding row and column indices respectively. Multiple nonzero elements may not be specified for the same row and column index.
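The coordinate storage layout can be sketched in Python as follows. The helper `dense_to_cs` and the sample matrix are hypothetical and are not part of the NAG Library; index values are kept 1-based to match the routine, although they are held in ordinary Python lists.

```python
def dense_to_cs(dense):
    """Convert a dense square matrix (list of rows) to coordinate storage.

    Returns (a, irow, icol) with 1-based indices, ordered by increasing row
    and by increasing column within each row; zero entries are skipped.
    """
    a, irow, icol = [], [], []
    for i, row in enumerate(dense, start=1):
        for j, v in enumerate(row, start=1):
            if v != 0:
                a.append(complex(v))
                irow.append(i)
                icol.append(j)
    return a, irow, icol


dense = [
    [2 + 1j, -1, 0],
    [0, 3, 1j],
    [-1j, 0, 4],
]
a, irow, icol = dense_to_cs(dense)
nnz = len(a)
```

Scanning the matrix row by row yields each location once, so no duplicate row and column pair can arise.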

The preconditioning matrices $M_b$, $b = 1, 2, \dots, NB$, are returned in terms of the CS representations of the matrices

$C_b = L_b + D_b^{-1} + U_b - 2I$.
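To make the packed form concrete, consider a hypothetical $2 \times 2$ block (the symbols $l$, $u$, $d_1$ and $d_2$ are illustrative and are not taken from the routine). The unit diagonals of $L_b$ and $U_b$ cancel against $2I$, so only the strictly triangular parts and the inverted diagonal entries remain:

```latex
L_b = \begin{pmatrix} 1 & 0 \\ l & 1 \end{pmatrix}, \quad
D_b = \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}, \quad
U_b = \begin{pmatrix} 1 & u \\ 0 & 1 \end{pmatrix}
\quad\Longrightarrow\quad
C_b = L_b + D_b^{-1} + U_b - 2I
    = \begin{pmatrix} 1/d_1 & u \\ l & 1/d_2 \end{pmatrix}.
```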

4 References

Meijerink J and Van der Vorst H (1977) An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix Math. Comput. 31 148–162

Meijerink J and Van der Vorst H (1981) Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems J. Comput. Phys. 44 134–155

Saad Y (1996) Iterative Methods for Sparse Linear Systems PWS Publishing Company, Boston, MA

5 Parameters

1: N – INTEGER, Input

On entry: $n$, the order of the matrix $A$.

Constraint: $N \geq 1$.

2: NNZ – INTEGER, Input

On entry: the number of nonzero elements in the matrix $A$.

Constraint: $1 \leq NNZ \leq N^{2}$.

3: A(LA) – COMPLEX (KIND=nag_wp) array, Input/Output

On entry: the nonzero elements in the matrix $A$, ordered by increasing row index, and by increasing column index within each row. Multiple entries for the same row and column indices are not permitted. The routine F11ZNF may be used to order the elements in this way.

On exit: the first NNZ entries of A contain the nonzero elements of $A$ and the next NNZC entries contain the elements of the matrices $C_b$, for $b = 1, 2, \dots, NB$, stored consecutively. Within each block the matrix elements are ordered by increasing row index, and by increasing column index within each row.

4: LA – INTEGER, Input

On entry: the dimension of the arrays A, IROW and ICOL as declared in the (sub)program from which F11DTF is called. These arrays must be of sufficient size to store both $A$ (NNZ elements) and the matrices $C_b$ (NNZC elements in total).

Constraint: $LA \geq 2 \times NNZ$.

5: IROW(LA) – INTEGER array, Input/Output

6: ICOL(LA) – INTEGER array, Input/Output

On entry: the row and column indices of the nonzero elements supplied in A.

Constraints:

IROW and ICOL must satisfy these constraints (which may be imposed by a call to F11ZNF):

$1 \leq IROW (i) \leq N$ and $1 \leq ICOL (i) \leq N$ , for $i = 1, 2, \dots, NNZ$ ;
either $IROW (i - 1) < IROW (i)$ or both $IROW (i - 1) = IROW (i)$ and $ICOL (i - 1) < ICOL (i)$ , for $i = 2, 3, \dots, NNZ$ .

On exit: the row and column indices of the nonzero elements returned in A.
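The entry constraints on IROW and ICOL can be checked before calling the routine. The following Python sketch is one way to do so; the function name `check_cs_order` is hypothetical, and the arrays are plain Python lists holding the 1-based index values.

```python
def check_cs_order(n, irow, icol):
    """Check the documented IROW/ICOL constraints for the first NNZ entries.

    Returns None when the constraints hold, otherwise the 1-based position
    of the first offending entry. Indices are 1-based, as in the routine.
    """
    for pos, (i, j) in enumerate(zip(irow, icol), start=1):
        if not (1 <= i <= n and 1 <= j <= n):
            return pos  # index outside 1..N
        if pos > 1:
            prev = (irow[pos - 2], icol[pos - 2])
            if (i, j) <= prev:
                return pos  # out of order, or a duplicate location
    return None
```

Comparing `(i, j)` pairs as tuples reproduces the stated rule: either the row index increases, or it is unchanged and the column index increases.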

7: NB – INTEGER, Input

On entry: the number of diagonal blocks to factorize.

Constraint: $1 \leq NB \leq N$.

8: ISTB($NB + 1$) – INTEGER array, Input

On entry: $ISTB(b)$, for $b = 1, 2, \dots, NB$, holds the index in arrays INDB, IPIVP, IPIVQ and IDIAG at which the entries defining block $b$ begin. $ISTB(NB + 1)$ holds the sum of the number of rows in all blocks plus $ISTB(1)$.

Constraint: $ISTB(1) \geq 1$, $ISTB(b) < ISTB(b + 1)$, for $b = 1, 2, \dots, NB$.

9: INDB(LINDB) – INTEGER array, Input

On entry: INDB must hold the row indices appearing in each diagonal block, stored consecutively. Thus the elements $INDB(ISTB(b))$ to $INDB(ISTB(b + 1) - 1)$ are the row indices in the $b$th block.

Constraint: $1 \leq INDB(m) \leq N$, for $m = 1, 2, \dots, ISTB(NB + 1) - 1$.
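The relationship between ISTB and INDB can be sketched in Python as follows. The helpers `build_blocks` and `block_rows` are hypothetical names, the sample blocks are made up, and it is assumed here that $ISTB(1) = 1$, which satisfies the constraint $ISTB(1) \geq 1$.

```python
def build_blocks(blocks):
    """Pack per-block row-index lists into ISTB and INDB (1-based values).

    blocks: list of lists of 1-based row indices; an index may occur in
            several blocks, which produces overlap.
    Returns (istb, indb) as Python lists; istb has len(blocks) + 1 entries.
    """
    istb, indb = [1], []
    for rows in blocks:
        if not rows:
            raise ValueError("each block needs at least one row")
        indb.extend(rows)
        istb.append(istb[-1] + len(rows))
    return istb, indb


def block_rows(istb, indb, b):
    """Row indices of block b (1-based): INDB(ISTB(b)) .. INDB(ISTB(b+1)-1)."""
    return indb[istb[b - 1] - 1 : istb[b] - 1]


# Two blocks of a 5 x 5 matrix overlapping in rows 3 and 4 (illustrative data).
istb, indb = build_blocks([[1, 2, 3, 4], [3, 4, 5]])
```

With this data the total number of stored row indices, $ISTB(NB + 1) - 1$, is 7, which exceeds the matrix order 5 because of the overlap.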

10: LINDB – INTEGER, Input

On entry: the dimension of the arrays INDB, IPIVP, IPIVQ and IDIAG as declared in the (sub)program from which F11DTF is called.

Constraint: $LINDB \geq ISTB(NB + 1) - 1$.

11: LFILL(NB) – INTEGER array, Input

On entry: if $LFILL(b) \geq 0$ its value is the maximum level of fill allowed in the decomposition of the block $b$ (see Section 8.2 in F11DNF). A negative value of $LFILL(b)$ indicates that $DTOL(b)$ will be used to control the fill in block $b$ instead.

12: DTOL(NB) – REAL (KIND=nag_wp) array, Input

On entry: if $LFILL(b) < 0$ then $DTOL(b)$ is used as a drop tolerance in block $b$ to control the fill-in (see Section 8.2 in F11DNF); otherwise $DTOL(b)$ is not referenced.

Constraint: if $LFILL(b) < 0$, $DTOL(b) \geq 0.0$, for $b = 1, 2, \dots, NB$.
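The interplay between LFILL and DTOL for a single block can be summarized by the following Python sketch. The function `fill_control` and the sample values are hypothetical; the sketch only mirrors the selection rule stated above and does not perform any factorization.

```python
def fill_control(lfill, dtol):
    """Report which fill control applies to one block.

    Mirrors the documented rule: LFILL(b) >= 0 selects level of fill,
    LFILL(b) < 0 selects the drop tolerance DTOL(b), which must be >= 0.0.
    """
    if lfill >= 0:
        return ("level", lfill)  # DTOL(b) is not referenced
    if dtol < 0.0:
        raise ValueError("DTOL(b) must be >= 0.0 when LFILL(b) < 0")
    return ("drop", dtol)


# Per-block settings for NB = 3 blocks (illustrative values only).
lfill = [0, 2, -1]
dtol = [0.0, 0.0, 1.0e-3]
controls = [fill_control(l, t) for l, t in zip(lfill, dtol)]
```

Because both arrays are indexed by block, different blocks may use different strategies within one call.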

13: PSTRAT(NB) – CHARACTER(1) array, Input

On entry: $PSTRAT(b)$, for $b = 1, 2, \dots, NB$, specifies the pivoting strategy to be adopted in block $b$ as follows:

$PSTRAT (b) ='N'$: No pivoting is carried out.
$PSTRAT (b) ='U'$: Pivoting is carried out according to the user-defined input values of IPIVP and IPIVQ.
$PSTRAT (b) ='P'$: Partial pivoting by columns for stability is carried out.
$PSTRAT (b) ='C'$: Complete pivoting by rows for sparsity, and by columns for stability, is carried out.

Suggested value: $PSTRAT(b) = 'C'$, for $b = 1, 2, \dots, NB$.

Constraint: $PSTRAT(b) = 'N'$, $'U'$, $'P'$ or $'C'$, for $b = 1, 2, \dots, NB$.

14: MILU(NB) – CHARACTER(1) array, Input

On entry: $MILU(b)$, for $b = 1, 2, \dots, NB$, indicates whether or not the factorization in block $b$ should be modified to preserve row-sums (see Section 8.4 in F11DNF).

$MILU (b) ='M'$: The factorization is modified.
$MILU (b) ='N'$: The factorization is not modified.

Constraint: $MILU(b) = 'M'$ or $'N'$, for $b = 1, 2, \dots, NB$.

15: IPIVP(LINDB) – INTEGER array, Input/Output

16: IPIVQ(LINDB) – INTEGER array, Input/Output

On entry: if $PSTRAT(b) = 'U'$, then $IPIVP(ISTB(b) + k - 1)$ and $IPIVQ(ISTB(b) + k - 1)$ must specify the row and column indices of the element used as a pivot at elimination stage $k$ of the factorization of block $b$. Otherwise IPIVP and IPIVQ need not be initialized.

Constraint: if $PSTRAT(b) = 'U'$, the elements $ISTB(b)$ to $ISTB(b + 1) - 1$ of IPIVP and IPIVQ must both hold valid permutations of the integers on $[1, ISTB(b + 1) - ISTB(b)]$.

On exit: the row and column indices of the pivot elements, arranged consecutively for each block, as for INDB. If $IPIVP(ISTB(b) + k - 1) = i$ and $IPIVQ(ISTB(b) + k - 1) = j$, then the element in row $i$ and column $j$ of $A_b$ was used as the pivot at elimination stage $k$.
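The permutation constraint for user-defined pivoting can be checked block by block. The following Python sketch uses a hypothetical helper `check_user_pivots` and made-up data; the arrays are Python lists holding the 1-based values that would be passed to the routine.

```python
def check_user_pivots(istb, ipiv, b):
    """Check that block b's segment of IPIVP (or IPIVQ) is a permutation.

    istb, ipiv: Python lists holding the 1-based values of ISTB and of
                IPIVP or IPIVQ; b is the 1-based block number.
    The segment must be a permutation of 1 .. ISTB(b+1) - ISTB(b).
    """
    start, stop = istb[b - 1], istb[b]          # 1-based, stop exclusive
    segment = ipiv[start - 1 : stop - 1]
    size = stop - start
    return sorted(segment) == list(range(1, size + 1))


istb = [1, 4, 6]                 # block 1 has 3 rows, block 2 has 2
ipivp = [2, 1, 3, 1, 2]          # valid in both blocks
ipivq = [1, 1, 3, 2, 1]          # block 1 repeats the value 1
```

Note that the pivot indices are local to each block, running from 1 to the number of rows in that block.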

17: ISTR($LINDB + 1$) – INTEGER array, Output

On exit: $ISTR(ISTB(b) + k - 1)$ gives the starting address in the arrays A, IROW and ICOL of row $k$ of the matrix $C_b$, for $b = 1, 2, \dots, NB$ and $k = 1, 2, \dots, ISTB(b + 1) - ISTB(b)$. $ISTR(ISTB(NB + 1))$ contains $NNZ + NNZC + 1$.
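The following Python sketch shows one way the ISTR addresses might be used to read a row of $C_b$ from the output arrays. It assumes that row $k$ of block $b$ occupies positions $ISTR(p)$ to $ISTR(p + 1) - 1$ with $p = ISTB(b) + k - 1$; this is an inference from the consecutive storage described above rather than a statement from the routine. The helper name `block_row_entries` and all data values are made up.

```python
def block_row_entries(a, icol, istb, istr, b, k):
    """Return [(column, value), ...] for row k of C_b.

    All index arrays hold the 1-based values described for the routine.
    Assumes row k of block b occupies positions ISTR(p) .. ISTR(p+1) - 1
    with p = ISTB(b) + k - 1, which is an inference from the layout above.
    """
    p = istb[b - 1] + k - 1
    first, last = istr[p - 1], istr[p] - 1      # 1-based, inclusive
    return [(icol[q - 1], a[q - 1]) for q in range(first, last + 1)]


# Made-up output for NNZ = 3, NB = 1 and a 2 x 2 block with NNZC = 4.
a = [2 + 0j, 1j, 4 + 0j, 0.5 + 0j, 0.5j, -1j, 0.25 + 0j]
icol = [1, 2, 2, 1, 2, 1, 2]
istb = [1, 3]
istr = [4, 6, 8]
row_2 = block_row_entries(a, icol, istb, istr, 1, 2)
```

In this data the final entry of `istr` equals $NNZ + NNZC + 1 = 8$, so the last row needs no special handling.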

18: IDIAG(LINDB) – INTEGER array, Output

On exit: $IDIAG(ISTB(b) + k - 1)$ gives the address in the arrays A, IROW and ICOL of the diagonal element in row $k$ of the matrix $C_b$, for $b = 1, 2, \dots, NB$ and $k = 1, 2, \dots, ISTB(b + 1) - ISTB(b)$.

19: NNZC – INTEGER, Output

On exit: the total number of nonzero elements in the matrices $C_b$, for $b = 1, 2, \dots, NB$.

20: NPIVM(NB) – INTEGER array, Output

On exit: if $NPIVM(b) > 0$ it gives the number of pivots which were modified during the factorization to ensure that $M_b$ exists.

If $NPIVM(b) = -1$ no pivot modifications were required, but a local restart occurred (see Section 8.3 in F11DNF). The quality of the preconditioner will generally depend on the returned values of $NPIVM(b)$, for $b = 1, 2, \dots, NB$.

If $NPIVM(b)$ is large, for some $b$, the preconditioner may not be satisfactory. In this case it may be advantageous to call F11DTF again with an increased value of $LFILL(b)$, a reduced value of $DTOL(b)$, or $PSTRAT(b) = 'C'$.

21: IWORK(LIWORK) – INTEGER array, Workspace

22: LIWORK – INTEGER, Input

On entry: the dimension of the array IWORK as declared in the (sub)program from which F11DTF is called.

Constraint: $LIWORK \geq 9 \times N + 3$.
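The dimension constraints on LA, LINDB and LIWORK can be collected in a small Python sketch. The helper `minimum_dimensions` and the sample sizes are hypothetical; the values returned are only the documented lower bounds, and in particular LA may need to be larger when the factorization produces substantial fill.

```python
def minimum_dimensions(n, nnz, istb):
    """Smallest array dimensions allowed by the documented constraints.

    istb: Python list with the 1-based values ISTB(1), ..., ISTB(NB+1).
    LA is only a lower bound: A, IROW and ICOL must also hold the NNZC
    fill entries, whose number is not known before the factorization.
    """
    return {
        "LA": 2 * nnz,
        "LINDB": istb[-1] - 1,
        "LIWORK": 9 * n + 3,
    }


dims = minimum_dimensions(n=9, nnz=33, istb=[1, 6, 11])
```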

23: IFAIL – INTEGER, Input/Output

On entry: IFAIL must be set to $0$, $-1$ or $1$. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.

For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is $0$. When the value $-1$ or $1$ is used it is essential to test the value of IFAIL on exit.

On exit: $IFAIL = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).

6 Error Indicators and Warnings

If on entry $IFAIL = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by X04AAF).

Errors or warnings detected by the routine:

$IFAIL = 1$: On entry, $DTOL (⟨value⟩) = ⟨value⟩$ .
Constraint: $DTOL (b) \geq 0.0$ for all $b$ .

On entry, for $b = ⟨value⟩$ , $ISTB (b + 1) = ⟨value⟩$ and $ISTB (b) = ⟨value⟩$ .
Constraint: $ISTB (b + 1) > ISTB (b)$ for all $b$ .

On entry, $INDB (⟨value⟩) = ⟨value⟩$ and $N = ⟨value⟩$ .
Constraint: $1 \leq INDB (b) \leq N$ for all $b$ .

On entry, $ISTB (1) = ⟨value⟩$ .
Constraint: $ISTB (1) \geq 1$ .

On entry, $LA = ⟨value⟩$ and $NNZ = ⟨value⟩$ .
Constraint: $LA \geq 2 \times NNZ$ .

On entry, $LINDB = ⟨value⟩$ , $ISTB (NB + 1) - 1 = ⟨value⟩$ and $NB = ⟨value⟩$ .
Constraint: $LINDB \geq ISTB (NB + 1) - 1$ .

On entry, $LIWORK = ⟨value⟩$ .
Constraint: $LIWORK \geq ⟨value⟩$ .

On entry, $MILU (⟨value⟩) = "⟨value⟩"$ .
Constraint: $MILU (b) ='M'$ or $'N'$ for all $b$ .

On entry, $N = ⟨value⟩$ .
Constraint: $N \geq 1$ .

On entry, $NB = ⟨value⟩$ and $N = ⟨value⟩$ .
Constraint: $1 \leq NB \leq N$ .

On entry, $NNZ = ⟨value⟩$ .
Constraint: $NNZ \geq 1$ .

On entry, $NNZ = ⟨value⟩$ and $N = ⟨value⟩$ .
Constraint: $NNZ \leq N^{2}$ .

On entry, $PSTRAT (⟨value⟩) = "⟨value⟩"$ .
Constraint: $PSTRAT (b) ='N'$ , $'U'$ , $'P'$ or $'C'$ for all $b$ .

$IFAIL = 2$: On entry, element $⟨value⟩$ of A was out of order.

On entry, $ICOL (⟨value⟩) = ⟨value⟩$ and $N = ⟨value⟩$ .
Constraint: $1 \leq ICOL (j) \leq N$ for all $j$ .

On entry, $IROW (⟨value⟩) = ⟨value⟩$ and $N = ⟨value⟩$ .
Constraint: $1 \leq IROW (i) \leq N$ for all $i$ .

On entry, location $⟨value⟩$ of $(IROW, ICOL)$ was a duplicate.

$IFAIL = 3$: On entry, the user-supplied value of IPIVP for block $⟨value⟩$ lies outside the range $[1, N]$ .

On entry, the user-supplied value of IPIVP for block $⟨value⟩$ was repeated.

On entry, the user-supplied value of IPIVQ for block $⟨value⟩$ lies outside the range $[1, N]$ .

On entry, the user-supplied value of IPIVQ for block $⟨value⟩$ was repeated.

$IFAIL = 4$: The number of nonzero entries in the decomposition is too large.
The decomposition has been terminated before completion.
Either increase LA, or reduce the fill by reducing LFILL, or increasing DTOL.

7 Accuracy

The accuracy of the factorization of each block $A_b$ will be determined by the size of the elements that are dropped and the size of any modifications made to the pivot elements. If these sizes are small then the computed factors will correspond to a matrix close to $A_b$. The factorization can generally be made more accurate by increasing the level of fill $LFILL(b)$, or by reducing the drop tolerance $DTOL(b)$ with $LFILL(b) < 0$.

If F11DTF is used in combination with F11BSF or F11DUF, the more accurate the factorization the fewer iterations will be required. However, the cost of the decomposition will also generally increase.

8 Further Comments

F11DTF calls F11DNF internally for each block $A_b$. The comments and advice provided in Section 8 in F11DNF on timing, control of fill, algorithmic details, and choice of parameters, are all therefore relevant to F11DTF, if interpreted blockwise.

9 Example

This example program reads in a sparse matrix $A$ and then defines a block partitioning of the row indices with a user-supplied overlap and computes an overlapping incomplete $LU$ factorization suitable for use as an additive Schwarz preconditioner. Such a factorization is used for this purpose in the example program of F11DUF.
