NAG Toolbox: nag_sparse_real_gen_precon_bdilu (f11df)
Purpose
nag_sparse_real_gen_precon_bdilu (f11df) computes a block diagonal incomplete $LU$ factorization of a real sparse nonsymmetric matrix, represented in coordinate storage format. The diagonal blocks may be composed of arbitrary rows and the corresponding columns, and may overlap. This factorization can be used to provide a block Jacobi or additive Schwarz preconditioner, for use in combination with nag_sparse_real_gen_basic_solver (f11be) or nag_sparse_real_gen_solve_bdilu (f11dg).
Syntax
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = f11df(n, nz, a, irow, icol, istb, indb, lfill, dtol, milu, ipivp, ipivq, 'la', la, 'nb', nb, 'lindb', lindb, 'pstrat', pstrat)
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = nag_sparse_real_gen_precon_bdilu(n, nz, a, irow, icol, istb, indb, lfill, dtol, milu, ipivp, ipivq, 'la', la, 'nb', nb, 'lindb', lindb, 'pstrat', pstrat)
Description
nag_sparse_real_gen_precon_bdilu (f11df) computes an incomplete $LU$ factorization (see Meijerink and Van der Vorst (1977) and Meijerink and Van der Vorst (1981)) of the (possibly overlapping) diagonal blocks $A_b$, for $b = 1,2,\dots,\mathrm{nb}$, of a real sparse nonsymmetric $n$ by $n$ matrix $A$. The factorization is intended primarily for use as a block Jacobi or additive Schwarz preconditioner (see Saad (1996)), with one of the iterative solvers nag_sparse_real_gen_basic_solver (f11be) and nag_sparse_real_gen_solve_bdilu (f11dg).
The nb diagonal blocks need not consist of consecutive rows and columns of $A$, but may be composed of arbitrarily indexed rows, and the corresponding columns, as defined in the arguments indb and istb. Any given row or column index may appear in more than one diagonal block, resulting in overlap. Each diagonal block $A_b$, for $b = 1,2,\dots,\mathrm{nb}$, is factorized as:
$$A_b = M_b + R_b$$
where
$$M_b = P_b L_b D_b U_b Q_b$$
and $L_b$ is lower triangular with unit diagonal elements, $D_b$ is diagonal, $U_b$ is upper triangular with unit diagonals, $P_b$ and $Q_b$ are permutation matrices, and $R_b$ is a remainder matrix.
The amount of fill-in occurring in the factorization of block $b$ can vary from zero to complete fill, and can be controlled by specifying either the maximum level of fill $\mathrm{lfill}(b)$, or the drop tolerance $\mathrm{dtol}(b)$.
The parameter $\mathrm{pstrat}(b)$ defines the pivoting strategy to be used in block $b$. The options currently available are no pivoting, user-defined pivoting, partial pivoting by columns for stability, and complete pivoting by rows for sparsity and by columns for stability. The factorization may optionally be modified to preserve the row-sums of the original block matrix.
The sparse matrix $A$ is represented in coordinate storage (CS) format (see Coordinate storage (CS) format in the F11 Chapter Introduction). The array a stores all the nonzero elements of the matrix $A$, while arrays irow and icol store the corresponding row and column indices respectively. Multiple nonzero elements may not be specified for the same row and column index.
The preconditioning matrices $M_b$, for $b = 1,2,\dots,\mathrm{nb}$, are returned in terms of the CS representations of the matrices
$$C_b = L_b + D_b^{-1} + U_b - 2I.$$
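As a small dense illustration of this storage convention, the following sketch forms the combined matrix for a single 3 by 3 block and then recovers the factors from it. It assumes the convention $C_b = L_b + D_b^{-1} + U_b - 2I$ (the one documented for nag_sparse_real_gen_precon_ilu (f11da)), uses no pivoting, and replaces the library's incomplete factorization with a plain hand-written elimination, so nothing is dropped. The names Ab, W, Lr, Dr and Ur are illustrative only and are not part of the NAG interface.

```matlab
% Dense stand-in for one diagonal block (illustrative values).
Ab = [ 64 -20   0;
      -12  64 -20;
        0 -12  64];
m = size(Ab, 1);

% Unpivoted Gaussian elimination: Ab = L * W, with W upper triangular.
L = eye(m);
W = Ab;
for k = 1:m-1
    for i = k+1:m
        L(i,k) = W(i,k) / W(k,k);
        W(i,:) = W(i,:) - L(i,k) * W(k,:);
    end
end
D = diag(diag(W));   % diagonal matrix of pivots
U = D \ W;           % unit upper triangular factor, so Ab = L*D*U

% Combined matrix under the assumed convention C = L + inv(D) + U - 2I.
C = L + diag(1 ./ diag(D)) + U - 2*eye(m);

% Recover the factors from C alone.
Lr = tril(C, -1) + eye(m);
Ur = triu(C,  1) + eye(m);
Dr = diag(1 ./ diag(C));
residual = norm(Lr*Dr*Ur - Ab, 'fro');
fprintf('C(1,1) = %e, residual = %e\n', C(1,1), residual);
```

The diagonal of C holds the reciprocals of the pivots, which is why the first diagonal entry equals 1/64 for a block whose leading element is 64.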
References
Meijerink J and Van der Vorst H (1977) An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix Math. Comput. 31 148–162
Meijerink J and Van der Vorst H (1981) Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems J. Comput. Phys. 44 134–155
Saad Y (1996) Iterative Methods for Sparse Linear Systems PWS Publishing Company, Boston, MA
Parameters
Compulsory Input Parameters
- 1: n – int64/int32/nag_int scalar
  $n$, the order of the matrix $A$.
  Constraint: $n \ge 1$.
- 2: nz – int64/int32/nag_int scalar
  The number of nonzero elements in the matrix $A$.
  Constraint: $1 \le \mathrm{nz} \le n^2$.
- 3: a(la) – double array
  The nonzero elements in the matrix $A$, ordered by increasing row index, and by increasing column index within each row. Multiple entries for the same row and column indices are not permitted. The function nag_sparse_real_gen_sort (f11za) may be used to order the elements in this way.
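- 4: irow(la) – int64/int32/nag_int array
- 5: icol(la) – int64/int32/nag_int array
  The row and column indices of the nonzero elements supplied in a.
  Constraints: irow and icol must satisfy these constraints (which may be imposed by a call to nag_sparse_real_gen_sort (f11za)):
  - $1 \le \mathrm{irow}(i) \le n$ and $1 \le \mathrm{icol}(i) \le n$, for $i = 1,2,\dots,\mathrm{nz}$;
  - either $\mathrm{irow}(i-1) < \mathrm{irow}(i)$ or both $\mathrm{irow}(i-1) = \mathrm{irow}(i)$ and $\mathrm{icol}(i-1) < \mathrm{icol}(i)$, for $i = 2,3,\dots,\mathrm{nz}$.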
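The ordering constraints above can be checked before calling the function. The sketch below is a plain MATLAB stand-in, not the library routine: it sorts a small, made-up set of CS entries with the built-in sortrows (where the documentation suggests nag_sparse_real_gen_sort (f11za)) and then tests the range and strict-ordering conditions. The data and the variable names in_range and ordered are hypothetical.

```matlab
% Made-up CS entries for a 3 by 3 matrix, deliberately out of order.
n    = 3;
irow = [2; 1; 3; 1; 2];
icol = [2; 1; 3; 2; 1];
a    = [5; 4; 6; -1; -2];

% Sort by increasing row index, then by increasing column index within a row.
[~, p] = sortrows([irow, icol]);
irow = irow(p);
icol = icol(p);
a    = a(p);

% Range check: every index must lie in 1..n.
in_range = all(irow >= 1 & irow <= n & icol >= 1 & icol <= n);

% Strict ordering check: consecutive entries must advance in (row, column)
% order, which also rules out duplicate (row, column) pairs.
dr = diff(irow);
dc = diff(icol);
ordered = all(dr > 0 | (dr == 0 & dc > 0));

fprintf('in_range = %d, ordered = %d\n', in_range, ordered);
```

Because the comparison is strict, a repeated (row, column) pair fails the test in the same way as an out-of-order entry.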
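- 6: istb(nb+1) – int64/int32/nag_int array
  $\mathrm{istb}(b)$, for $b = 1,2,\dots,\mathrm{nb}$, holds the indices in arrays indb, ipivp, ipivq and idiag that, on successful exit from this function, define block $b$. $\mathrm{istb}(\mathrm{nb}+1)$ holds the sum of the number of rows in all blocks plus $\mathrm{istb}(1)$.
  Constraint: $\mathrm{istb}(b+1) > \mathrm{istb}(b)$, for $b = 1,2,\dots,\mathrm{nb}$.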
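- 7: indb(lindb) – int64/int32/nag_int array
  indb must hold the row indices appearing in each diagonal block, stored consecutively. Thus the elements $\mathrm{indb}(\mathrm{istb}(b))$ to $\mathrm{indb}(\mathrm{istb}(b+1)-1)$ are the row indices in the $b$th block, for $b = 1,2,\dots,\mathrm{nb}$.
  Constraint: $1 \le \mathrm{indb}(m) \le n$, for $m = \mathrm{istb}(1),\dots,\mathrm{istb}(\mathrm{nb}+1)-1$.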
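To make the istb/indb layout concrete, the sketch below takes the index arrays printed in the example results of this document (three overlapping blocks of a 9 by 9 matrix) and lists the rows of each block. It is plain MATLAB and does not call the library; the variable names sizes and counts are illustrative.

```matlab
% Index arrays as printed in the example results (n = 9, nb = 3).
n    = 9;
istb = [1; 7; 16; 22];
indb = [1 2 3 4 5 6, 4 5 6 1 7 2 8 3 9, 7 8 9 4 5 6]';
nb   = numel(istb) - 1;

% Block b occupies positions istb(b) .. istb(b+1)-1 of indb.
sizes = diff(istb);
for b = 1:nb
    rows = indb(istb(b):istb(b+1)-1);
    fprintf('Block %d: order %d, rows %s\n', b, numel(rows), mat2str(rows'));
end

% Count how many blocks each row of A belongs to; a count above 1 means overlap.
counts = accumarray(indb, 1, [n 1]);
fprintf('Rows shared by more than one block: %s\n', mat2str(find(counts > 1)'));
```

Note that the rows inside a block need not be sorted: block 2 lists its original rows 4, 5, 6 first and the rows added by the overlap afterwards.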
- 8: lfill(nb) – int64/int32/nag_int array
  If $\mathrm{lfill}(b) \ge 0$ its value is the maximum level of fill allowed in the decomposition of block $b$ (see Control of Fill-in in nag_sparse_real_gen_precon_ilu (f11da)). A negative value of $\mathrm{lfill}(b)$ indicates that $\mathrm{dtol}(b)$ will be used to control the fill in block $b$ instead.
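- 9: dtol(nb) – double array
  If $\mathrm{lfill}(b) < 0$ then $\mathrm{dtol}(b)$ is used as a drop tolerance in block $b$ to control the fill-in (see Control of Fill-in in nag_sparse_real_gen_precon_ilu (f11da)); otherwise $\mathrm{dtol}(b)$ is not referenced.
  Constraint: if $\mathrm{lfill}(b) < 0$, $\mathrm{dtol}(b) \ge 0.0$, for $b = 1,2,\dots,\mathrm{nb}$.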
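- 10: milu(nb) – cell array of strings
  $\mathrm{milu}(b)$, for $b = 1,2,\dots,\mathrm{nb}$, indicates whether or not the factorization in block $b$ should be modified to preserve row-sums (see Choice of Parameters in nag_sparse_real_gen_precon_ilu (f11da)).
  - $\mathrm{milu}(b) = $ 'M': The factorization is modified.
  - $\mathrm{milu}(b) = $ 'N': The factorization is not modified.
  Constraint: $\mathrm{milu}(b) = $ 'M' or 'N', for $b = 1,2,\dots,\mathrm{nb}$.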
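- 11: ipivp(lindb) – int64/int32/nag_int array
- 12: ipivq(lindb) – int64/int32/nag_int array
  If $\mathrm{pstrat}(b) = $ 'U', then $\mathrm{ipivp}(\mathrm{istb}(b)+k-1)$ and $\mathrm{ipivq}(\mathrm{istb}(b)+k-1)$ must specify the row and column indices of the element used as a pivot at elimination stage $k$ of the factorization of block $b$. Otherwise ipivp and ipivq need not be initialized.
  Constraint: if $\mathrm{pstrat}(b) = $ 'U', the elements $\mathrm{istb}(b)$ to $\mathrm{istb}(b+1)-1$ of ipivp and ipivq must both hold valid permutations of the integers on $[1, \mathrm{istb}(b+1)-\mathrm{istb}(b)]$.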
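A user-supplied pivot sequence can be validated before the call. The sketch below assumes that, for each block, the corresponding segments of ipivp and ipivq must each be a permutation of the block-local indices 1 to the block order; that reading of the constraint is an assumption made here, and the data are invented for illustration.

```matlab
% Two illustrative blocks of order 3 and 2.
istb  = [1; 4; 6];
ipivp = [1; 2; 3; 2; 1];   % row pivots, block-local indices (assumed)
ipivq = [3; 1; 2; 1; 2];   % column pivots, block-local indices (assumed)
nb    = numel(istb) - 1;

ok = true(nb, 1);
for b = 1:nb
    seg = istb(b):istb(b+1)-1;   % positions belonging to block b
    m   = numel(seg);            % order of block b
    % A valid permutation contains each of 1..m exactly once.
    ok(b) = isequal(sort(ipivp(seg)), (1:m)') && ...
            isequal(sort(ipivq(seg)), (1:m)');
    fprintf('Block %d: pivot sequence valid = %d\n', b, ok(b));
end
```

A repeated or out-of-range entry in either segment makes the sorted segment differ from 1..m, so both failure modes are caught by the same comparison.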
Optional Input Parameters
- 1: la – int64/int32/nag_int scalar
  Default: the dimension of the arrays a, irow, icol. (An error is raised if these dimensions are not equal.)
  The dimension of the arrays a, irow and icol. These arrays must be of sufficient size to store both $A$ (nz elements) and the matrices $C_b$ (nnzc elements).
  Note: the minimum value for la is only appropriate if lfill and dtol are set such that minimal fill-in occurs. If this is not the case then we recommend that la is set much larger than the minimum value indicated in the constraint.
  Constraint: $\mathrm{la} \ge 2 \times \mathrm{nz}$.
- 2: nb – int64/int32/nag_int scalar
  Default: the dimension of the arrays lfill, dtol, pstrat, milu. (An error is raised if these dimensions are not equal.)
  The number of diagonal blocks to factorize.
  Constraint: $1 \le \mathrm{nb} \le n$.
- 3: lindb – int64/int32/nag_int scalar
  Default: the dimension of the arrays indb, ipivp, ipivq. (An error is raised if these dimensions are not equal.)
  The dimension of the arrays indb, ipivp, ipivq and idiag.
  Constraint: $\mathrm{lindb} \ge \mathrm{istb}(\mathrm{nb}+1) - 1$.
- 4: pstrat(nb) – cell array of strings
  Suggested value: $\mathrm{pstrat}(b) = $ 'C', for $b = 1,2,\dots,\mathrm{nb}$.
Default:
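  $\mathrm{pstrat}(b)$, for $b = 1,2,\dots,\mathrm{nb}$, specifies the pivoting strategy to be adopted in block $b$ as follows:
  - $\mathrm{pstrat}(b) = $ 'N': No pivoting is carried out.
  - $\mathrm{pstrat}(b) = $ 'U': Pivoting is carried out according to the user-defined input values of ipivp and ipivq.
  - $\mathrm{pstrat}(b) = $ 'P': Partial pivoting by columns for stability is carried out.
  - $\mathrm{pstrat}(b) = $ 'C': Complete pivoting by rows for sparsity, and by columns for stability, is carried out.
  Constraint: $\mathrm{pstrat}(b) = $ 'N', 'U', 'P' or 'C', for $b = 1,2,\dots,\mathrm{nb}$.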
Output Parameters
- 1: a(la) – double array
  The first nz entries of a contain the nonzero elements of $A$ and the next nnzc entries contain the elements of the matrices $C_b$, for $b = 1,2,\dots,\mathrm{nb}$, stored consecutively. Within each block the matrix elements are ordered by increasing row index, and by increasing column index within each row.
- 2: irow(la) – int64/int32/nag_int array
- 3: icol(la) – int64/int32/nag_int array
  The row and column indices of the nonzero elements returned in a.
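- 4: ipivp(lindb) – int64/int32/nag_int array
- 5: ipivq(lindb) – int64/int32/nag_int array
  The row and column indices of the pivot elements, arranged consecutively for each block, as for indb. If $\mathrm{ipivp}(\mathrm{istb}(b)+k-1) = i$ and $\mathrm{ipivq}(\mathrm{istb}(b)+k-1) = j$, then the element in row $i$ and column $j$ of $A_b$ was used as the pivot at elimination stage $k$.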
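- 6: istr(lindb+1) – int64/int32/nag_int array
  $\mathrm{istr}(\mathrm{istb}(b)+k-1)$ gives the index in the arrays a, irow and icol of row $k$ of the matrix $C_b$, for $b = 1,2,\dots,\mathrm{nb}$ and $k = 1,2,\dots,\mathrm{istb}(b+1)-\mathrm{istb}(b)$.
  $\mathrm{istr}(\mathrm{istb}(\mathrm{nb}+1))$ contains $\mathrm{nz}+\mathrm{nnzc}+1$.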
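- 7: idiag(lindb) – int64/int32/nag_int array
  $\mathrm{idiag}(\mathrm{istb}(b)+k-1)$ gives the index in the arrays a, irow and icol of the diagonal element in row $k$ of the matrix $C_b$, for $b = 1,2,\dots,\mathrm{nb}$ and $k = 1,2,\dots,\mathrm{istb}(b+1)-\mathrm{istb}(b)$.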
- 8: nnzc – int64/int32/nag_int scalar
  The sum total number of nonzero elements in the matrices $C_b$, for $b = 1,2,\dots,\mathrm{nb}$.
- 9: npivm(nb) – int64/int32/nag_int array
  If $\mathrm{npivm}(b) > 0$ it gives the number of pivots which were modified during the factorization to ensure that $M_b$ exists.
  If $\mathrm{npivm}(b) = -1$ no pivot modifications were required, but a local restart occurred (see Algorithmic Details in nag_sparse_real_gen_precon_ilu (f11da)). The quality of the preconditioner will generally depend on the returned values of $\mathrm{npivm}(b)$, for $b = 1,2,\dots,\mathrm{nb}$.
  If $\mathrm{npivm}(b)$ is large, for some block, the preconditioner may not be satisfactory. In this case it may be advantageous to call nag_sparse_real_gen_precon_bdilu (f11df) again with an increased value of $\mathrm{lfill}(b)$, a reduced value of $\mathrm{dtol}(b)$, or $\mathrm{pstrat}(b) = $ 'C' or 'P'.
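The advice above can be turned into a simple post-call check. The sketch below classifies each block from a made-up npivm vector; it assumes that a positive entry counts modified pivots and that -1 flags a local restart. The cut-off of 5 for "large" is an arbitrary illustrative choice, not a value taken from the library documentation.

```matlab
% Hypothetical npivm values for four blocks.
npivm     = [0; -1; 7; 2];
threshold = 5;               % arbitrary cut-off for "large" (assumption)

status = cell(numel(npivm), 1);
for b = 1:numel(npivm)
    if npivm(b) > threshold
        % Many modified pivots: consider a larger lfill(b), a smaller
        % dtol(b), or a pivoting strategy for this block.
        status{b} = 'refactorize';
    elseif npivm(b) > 0
        status{b} = 'modified';
    elseif npivm(b) == -1
        status{b} = 'restart';
    else
        status{b} = 'clean';
    end
    fprintf('Block %d: npivm = %d -> %s\n', b, npivm(b), status{b});
end
```

In practice the threshold would be tuned against the iteration counts observed in the solver, since the documentation only states that large values may indicate an unsatisfactory preconditioner.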
- 10: ifail – int64/int32/nag_int scalar
  $\mathrm{ifail} = 0$ unless the function detects an error (see Error Indicators and Warnings).
Error Indicators and Warnings
Errors or warnings detected by the function:
ifail = 1
Constraint: .
Constraint: , for .
Constraint: , for .
Constraint: .
Constraint: .
Constraint: .
Constraint: , for
Constraint: $\mathrm{milu}(b) = $ 'M' or 'N' for all $b$.
Constraint: .
Constraint: .
Constraint: .
Constraint: $\mathrm{pstrat}(b) = $ 'N', 'U', 'P' or 'C' for all $b$.
liwork is too small.
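ifail = 2
Constraint: $1 \le \mathrm{icol}(i) \le n$, for $i = 1,2,\dots,\mathrm{nz}$.
Constraint: $1 \le \mathrm{irow}(i) \le n$, for $i = 1,2,\dots,\mathrm{nz}$.
On entry, element ⟨value⟩ of a was out of order.
On entry, location ⟨value⟩ of (irow, icol) was a duplicate.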
ifail = 3
On entry, the user-supplied value of ipivp for block ⟨value⟩ lies outside its range.
On entry, the user-supplied value of ipivp for block ⟨value⟩ was repeated.
On entry, the user-supplied value of ipivq for block ⟨value⟩ lies outside its range.
On entry, the user-supplied value of ipivq for block ⟨value⟩ was repeated.
ifail = 4
The number of nonzero entries in the decomposition is too large.
The decomposition has been terminated before completion.
Either increase la, or reduce the fill by reducing lfill, or increasing dtol.
ifail = -99
An unexpected error has been triggered by this routine. Please contact NAG.
ifail = -399
Your licence key may have expired or may not have been installed correctly.
ifail = -999
Dynamic memory allocation failed.
Accuracy
The accuracy of the factorization of each block $A_b$ will be determined by the size of the elements that are dropped and the size of any modifications made to the pivot elements. If these sizes are small then the computed factors will correspond to a matrix close to $A_b$. The factorization can generally be made more accurate by increasing the level of fill $\mathrm{lfill}(b)$, or by reducing the drop tolerance $\mathrm{dtol}(b)$ with $\mathrm{lfill}(b) < 0$.
If nag_sparse_real_gen_precon_bdilu (f11df) is used in combination with nag_sparse_real_gen_basic_solver (f11be) or nag_sparse_real_gen_solve_bdilu (f11dg), the more accurate the factorization the fewer iterations will be required. However, the cost of the decomposition will also generally increase.
Further Comments
nag_sparse_real_gen_precon_bdilu (f11df) calls nag_sparse_real_gen_precon_ilu (f11da) internally for each block $A_b$. The comments and advice provided in Further Comments in nag_sparse_real_gen_precon_ilu (f11da) on timing, control of fill, algorithmic details, and choice of parameters, are all therefore relevant to nag_sparse_real_gen_precon_bdilu (f11df), if interpreted blockwise.
Example
This example program reads in a sparse matrix $A$ and then defines a block partitioning of the row indices with a user-supplied overlap and computes an overlapping incomplete $LU$ factorization suitable for use as an additive Schwarz preconditioner. Such a factorization is used for this purpose in the example program of nag_sparse_real_gen_solve_bdilu (f11dg).
function f11df_example

fprintf('f11df example results\n\n');

% Problem size
n  = int64(9);
nz = int64(33);

% Allocate generously: la = 20*nz leaves room for the factors
a    = zeros(20*nz, 1);
irow = zeros(20*nz, 1, 'int64');
icol = zeros(20*nz, 1, 'int64');

% Matrix in coordinate storage, ordered by row then column
a(1:nz) = [64; -20; -20; -12; 64; -20; -20; -12; 64; -20; -12;
           64; -20; -20; -12; -12; 64; -20; -20; -12; -12; 64;
           -20; -12; 64; -20; -12; -12; 64; -20; -12; -12; 64];
irow(1:nz) = [ 1; 1; 1; 2; 2; 2; 2; 3; 3; 3; 4;
               4; 4; 4; 5; 5; 5; 5; 5; 6; 6; 6;
               6; 7; 7; 7; 8; 8; 8; 8; 9; 9; 9];
icol(1:nz) = [ 1; 2; 4; 1; 2; 3; 5; 2; 3; 6; 1;
               4; 5; 7; 2; 4; 5; 6; 8; 3; 5; 6;
               9; 4; 7; 8; 5; 7; 8; 9; 6; 8; 9];

% Number of blocks, levels of overlap, and per-block options
nb     = int64(3);
nover  = 1;
lfill  = [int64(0); 0; 0];
dtol   = [ 0; 0; 0];
pstrat = {'n'; 'n'; 'n'};
milu   = {'n'; 'n'; 'n'};

% Start from a non-overlapping partition into nb blocks of mb rows
mb    = idivide(n+nb-1, nb);
istb  = zeros(nb+1, 1, 'int64');
indb  = zeros(3*n, 1, 'int64');
ipivp = zeros(3*n, 1, 'int64');
ipivq = zeros(3*n, 1, 'int64');
istb(1:nb) = [1:mb:nb*mb];
istb(nb+1) = n+1;
indb(1:n)  = [1:n];

% Extend each block by nover levels of overlap
[istb, indb, ifail] = f11df_overlap(n, nz, irow, icol, nb, ...
                                    istb, indb, 3*n, nover);
if (ifail == -999)
  error('indb is too small, size of indb = %d', numel(indb));
end

fprintf('\nOriginal matrix\n');
fprintf(' n = %d\n', n);
fprintf(' nz = %d\n', nz);
fprintf(' nb = %d\n', nb);
for k=1:nb
  fprintf(' Block %d: order = %d, start row = %d\n', k, istb(k+1)-istb(k), ...
          min(indb(istb(k):istb(k+1)-1)));
end

% Calculate the block diagonal incomplete LU factorization
[a, irow, icol, ipivp, ipivq, istr, idiag, nnzc, npivm, ifail] = ...
  f11df( ...
         n, nz, a, irow, icol, istb, indb, ...
         lfill, dtol, milu, ipivp, ipivq, 'pstrat', pstrat);

fprintf('\nFactorization\n');
fprintf(' nnzc = %d\n\n', nnzc);
fprintf(' Elements of factorization\n\n');
fprintf(' i j c(i,j) index\n');
for k=1:nb
  fprintf(' C_%d --------------------------------\n', k);
  % istr(istb(k)) .. istr(istb(k+1))-1 spans the entries of C_k
  for i = istr(istb(k)):istr(istb(k+1))-1
    fprintf(' %4d%4d%16e%8d\n', irow(i), icol(i), a(i), i);
  end
end

fprintf('\n Details of factorized blocks\n\n');
if max(npivm) > 0
  % Pivots were modified: print the pivot indices as well
  fprintf(' k i istr(i) idiag(i) indb(i) ipivp(i) ipivq(i)\n');
  for k=1:nb
    i = istb(k);
    fprintf(' %4d%4d%10d%10d%10d%10d%10d\n', k, i, istr(i), idiag(i), ...
            indb(i), ipivp(i), ipivq(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf(' %7d%10d%10d%10d%10d%10d\n', i, istr(i), idiag(i), ...
              indb(i), ipivp(i), ipivq(i));
    end
    fprintf(' ------------------------------------\n');
  end
else
  fprintf(' k i istr(i) idiag(i) indb(i)\n');
  for k=1:nb
    i = istb(k);
    fprintf('%3d%4d%10d%10d%10d\n', k, i, istr(i), idiag(i), indb(i));
    for i = istb(k)+1:istb(k+1)-1
      fprintf('%7d%10d%10d%10d\n', i, istr(i), idiag(i), indb(i));
    end
    fprintf(' ------------------------------------\n');
  end
end
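function [istb, indb, ifail] = f11df_overlap(n, nz, irow, icol, nb, ...
                                             istb, indb, lindb, nover)
% Extend each block by nover levels of overlap, updating istb and indb.
% Returns ifail = -999 if indb (of length lindb) is too small.

ifail = 0;
iwork = zeros(3*n+1, 1, 'int64');

% iwork(1:n): number of nonzeros in each row
for k=1:nz
  iwork(irow(k)) = iwork(irow(k)) + 1;
end

% iwork(n+1:2*n+1): index in irow/icol at which each row starts
iwork(n+1) = 1;
for i = 1:n
  iwork(n+i+1) = iwork(n+i) + iwork(i);
end

for k=1:nb
  % Reuse iwork(1:n) as flags marking the rows already in block k
  iwork(1:n) = 0;
  for l = istb(k):istb(k+1)-1
    iwork(indb(l)) = 1;
  end

  for iover=1:nover
    % Collect unmarked column indices coupled to rows of block k;
    % new indices are stored in iwork(2*n+2:2*n+1+ind)
    ind = 0;
    for l = istb(k):istb(k+1)-1
      row = indb(l);
      for i = iwork(n+row):iwork(n+row+1)-1
        if (iwork(icol(i))==0)
          iwork(icol(i)) = 1;
          ind = ind + 1;
          iwork(2*n+1+ind) = icol(i);
        end
      end
    end

    % Check that indb can hold the nadd extra indices
    nadd = ind;
    if (istb(nb+1)+nadd-1>lindb)
      ifail = -999;
      return;
    end

    % Shift the entries of the later blocks up by nadd
    for i = istb(nb+1) - 1:-1:istb(k+1)
      indb(i+nadd) = indb(i);
    end

    % Append the new indices to block k and update the block pointers
    n21 = 2*n + 1;
    ik = istb(k+1) - 1;
    indb(ik+1:ik+nadd) = iwork(n21+1:n21+nadd);
    istb(k+1:nb+1) = istb(k+1:nb+1) + nadd;
  end
end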
f11df example results
Original matrix
n = 9
nz = 33
nb = 3
Block 1: order = 6, start row = 1
Block 2: order = 9, start row = 1
Block 3: order = 6, start row = 4
Factorization
nnzc = 73
Elements of factorization
i j c(i,j) index
C_1 --------------------------------
1 1 1.562500e-02 34
1 2 -3.125000e-01 35
1 4 -3.125000e-01 36
2 1 -1.875000e-01 37
2 2 1.659751e-02 38
2 3 -3.319502e-01 39
2 5 -3.319502e-01 40
3 2 -1.991701e-01 41
3 3 1.666206e-02 42
3 6 -3.332412e-01 43
4 1 -1.875000e-01 44
4 4 1.659751e-02 45
4 5 -3.319502e-01 46
5 2 -1.991701e-01 47
5 4 -1.991701e-01 48
5 5 1.784656e-02 49
5 6 -3.569313e-01 50
6 3 -1.999447e-01 51
6 5 -2.141588e-01 52
6 6 1.794754e-02 53
C_2 --------------------------------
1 1 1.562500e-02 54
1 2 -3.125000e-01 55
1 4 -1.875000e-01 56
1 5 -3.125000e-01 57
2 1 -1.875000e-01 58
2 2 1.659751e-02 59
2 3 -3.319502e-01 60
2 6 -1.991701e-01 61
2 7 -3.319502e-01 62
3 2 -1.991701e-01 63
3 3 1.666206e-02 64
3 8 -1.999447e-01 65
3 9 -3.332412e-01 66
4 1 -3.125000e-01 67
4 4 1.659751e-02 68
4 6 -3.319502e-01 69
5 1 -1.875000e-01 70
5 5 1.659751e-02 71
5 7 -3.319502e-01 72
6 2 -3.319502e-01 73
6 4 -1.991701e-01 74
6 6 1.784656e-02 75
6 8 -3.569313e-01 76
7 2 -1.991701e-01 77
7 5 -1.991701e-01 78
7 7 1.784656e-02 79
7 9 -3.569313e-01 80
8 3 -3.332412e-01 81
8 6 -2.141588e-01 82
8 8 1.794754e-02 83
9 3 -1.999447e-01 84
9 7 -2.141588e-01 85
9 9 1.794754e-02 86
C_3 --------------------------------
1 1 1.562500e-02 87
1 2 -3.125000e-01 88
1 4 -1.875000e-01 89
2 1 -1.875000e-01 90
2 2 1.659751e-02 91
2 3 -3.319502e-01 92
2 5 -1.991701e-01 93
3 2 -1.991701e-01 94
3 3 1.666206e-02 95
3 6 -1.999447e-01 96
4 1 -3.125000e-01 97
4 4 1.659751e-02 98
4 5 -3.319502e-01 99
5 2 -3.319502e-01 100
5 4 -1.991701e-01 101
5 5 1.784656e-02 102
5 6 -3.569313e-01 103
6 3 -3.332412e-01 104
6 5 -2.141588e-01 105
6 6 1.794754e-02 106
Details of factorized blocks
k i istr(i) idiag(i) indb(i)
1 1 34 34 1
2 37 38 2
3 41 42 3
4 44 45 4
5 47 49 5
6 51 53 6
------------------------------------
2 7 54 54 4
8 58 59 5
9 63 64 6
10 67 68 1
11 70 71 7
12 73 75 2
13 77 79 8
14 81 83 3
15 84 86 9
------------------------------------
3 16 87 87 7
17 90 91 8
18 94 95 9
19 97 98 4
20 100 102 5
21 104 106 6
------------------------------------
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015