NAG FL Interface
f01brf (real_gen_sparse_lu)
1
Purpose
f01brf factorizes a real sparse matrix. The routine either forms the factorization of a permutation of the entire matrix, or, optionally, first permutes the matrix to block lower triangular form and then only factorizes the diagonal blocks.
2
Specification
Fortran Interface
Subroutine f01brf ( |
n, nz, a, licn, irn, lirn, icn, pivot, ikeep, iw, w, lblock, grow, abort, idisp, ifail) |
Integer, Intent (In) |
:: |
n, nz, licn, lirn |
Integer, Intent (Inout) |
:: |
irn(lirn), icn(licn), ifail |
Integer, Intent (Out) |
:: |
ikeep(5*n), iw(8*n), idisp(10) |
Real (Kind=nag_wp), Intent (In) |
:: |
pivot |
Real (Kind=nag_wp), Intent (Inout) |
:: |
a(licn) |
Real (Kind=nag_wp), Intent (Out) |
:: |
w(n) |
Logical, Intent (In) |
:: |
lblock, grow, abort(4) |
|
C Header Interface
#include <nag.h>
void |
f01brf_ (const Integer *n, const Integer *nz, double a[], const Integer *licn, Integer irn[], const Integer *lirn, Integer icn[], const double *pivot, Integer ikeep[], Integer iw[], double w[], const logical *lblock, const logical *grow, const logical abort[], Integer idisp[], Integer *ifail) |
|
C++ Header Interface
#include <nag.h> extern "C" {
void |
f01brf_ (const Integer &n, const Integer &nz, double a[], const Integer &licn, Integer irn[], const Integer &lirn, Integer icn[], const double &pivot, Integer ikeep[], Integer iw[], double w[], const logical &lblock, const logical &grow, const logical abort[], Integer idisp[], Integer &ifail) |
}
|
The routine may be called by the names f01brf or nagf_matop_real_gen_sparse_lu.
3
Description
Given a real sparse matrix
,
f01brf may be used to obtain the
factorization of a permutation of
,
where
and
are permutation matrices,
is unit lower triangular and
is upper triangular. The routine uses a sparse variant of Gaussian elimination, and the pivotal strategy is designed to compromise between maintaining sparsity and controlling loss of accuracy through round-off.
Optionally the routine first permutes the matrix into block lower triangular form and then only factorizes the diagonal blocks. For some matrices this gives a considerable saving in storage and execution time.
Extensive data checks are made; duplicated nonzeros can be accumulated.
The factorization is intended to be used by
f04axf to solve sparse systems of linear equations
or
. If several matrices of the same sparsity pattern are to be factorized,
f01bsf should be used for the second and subsequent matrices.
The method is fully described in
Duff (1977).
A more recent algorithm for the same calculation is provided by
f11mef.
4
References
Duff I S (1977) MA28 – a set of Fortran subroutines for sparse unsymmetric linear equations AERE Report R8730 HMSO
5
Arguments
-
1:
– Integer
Input
-
On entry: , the order of the matrix .
Constraint:
.
-
2:
– Integer
Input
-
On entry: the number of nonzero elements in the matrix .
Constraint:
.
-
3:
– Real (Kind=nag_wp) array
Input/Output
-
On entry: , for , must contain the nonzero elements of the sparse matrix . They can be in any order since f01brf will reorder them.
On exit: the nonzero elements in the
factorization. The array must
not be changed by you between a call of
f01brf and a call of
f04axf.
-
4:
– Integer
Input
-
On entry: the dimension of the arrays
a and
icn as declared in the (sub)program from which
f01brf is called. Since the factorization is returned in
a and
icn,
licn should be large enough to accommodate this and should ordinarily be
to
times as large as
nz.
Constraint:
.
-
5:
– Integer array
Input/Output
-
On entry: , for , must contain the row index of the nonzero element stored in .
On exit:
irn is overwritten and is not needed for subsequent calls of
f01bsf or
f04axf.
-
6:
– Integer
Input
-
On entry: the dimension of the array
irn as declared in the (sub)program from which
f01brf is called. It need not be as large as
licn; normally it will not need to be very much greater than
nz.
Constraint:
.
-
7:
– Integer array
Communication Array
-
On entry: , for , must contain the column index of the nonzero element stored in .
On exit:
icn contains the column indices of the nonzero elements in the factorization. The array must
not be changed by you between a call of
f01brf and subsequent calls of
f01bsf or
f04axf.
-
8:
– Real (Kind=nag_wp)
Input
-
On entry: should have a value in the range
and is used to control the choice of pivots. If
, the value
is assumed, and if
, the value
is assumed. When searching a row for a pivot, any element is excluded which is less than
pivot times the largest of those elements in the row available as pivots. Thus decreasing
pivot biases the algorithm to maintaining sparsity at the expense of stability.
Suggested value:
has been found to work well on test examples.
-
9:
– Integer array
Communication Array
-
On exit: indexing information about the factorization.
You must
not change
ikeep between a call of
f01brf and subsequent calls to
f01bsf or
f04axf.
-
10:
– Integer array
Workspace
-
-
11:
– Real (Kind=nag_wp) array
Output
-
On exit: if
,
contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization (see
grow); the rest of the array is used as workspace.
If , the array is not used.
-
12:
– Logical
Input
-
On entry: if , the matrix is preordered into block lower triangular form before the factorization is performed; otherwise the entire matrix is factorized.
Suggested value:
unless the matrix is known to be irreducible, or is singular and an upper bound on the rank is required.
-
13:
– Logical
Input
-
On entry: if
, then on exit
contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization. If the matrix is well-scaled (see
Section 9.2), then a high value for
indicates that the
factorization may be inaccurate and you should be wary of the results and perhaps increase the argument
pivot for subsequent runs (see
Section 7).
Suggested value:
.
-
14:
– Logical array
Input
-
On entry: if
,
f01brf will exit immediately on detecting a structural singularity (one that depends on the pattern of nonzeros) and return
; otherwise it will complete the factorization (see
Section 9.3).
If
,
f01brf will exit immediately on detecting a numerical singularity (one that depends on the numerical values) and return
; otherwise it will complete the factorization (see
Section 9.3).
If
,
f01brf will exit immediately (with
) when the arrays
a and
icn are filled up by the previously factorized, active and unfactorized parts of the matrix; otherwise it continues so that better guidance on necessary array sizes can be given in
and
, and will exit with
ifail in the range
to
. Note that there is always an immediate error exit if the array
irn is too small.
If , f01brf exits immediately (with ) if it finds duplicate elements in the input matrix.
If
,
f01brf proceeds using a value equal to the sum of the duplicate elements. In either case details of each duplicate element are output on the current advisory message unit (see
x04abf), unless suppressed by the value of
ifail on entry.
Suggested values:
- ;
- ;
- ;
- .
-
15:
– Integer array
Communication Array
-
On exit: contains information about the factorization.
and
indicate the position in arrays
a and
icn of the first and last elements in the
factorization of the diagonal blocks. (
gives the number of nonzeros in the factorization.)
and
must not be changed by you between a call of
f01brf and subsequent calls to
f01bsf or
f04axf.
and
monitor the adequacy of ‘elbow room’ in the arrays
irn and
a (and
icn) respectively, by giving the number of times that the data in these arrays has been compressed during the factorization to release more storage. If either
or
is quite large (say greater than
), it will probably pay you to increase the size of the corresponding array(s) for subsequent runs. If either is very low or zero, then you can perhaps save storage by reducing the size of the corresponding array(s).
, when , gives an upper bound on the rank of the matrix; when , gives an upper bound on the sum of the ranks of the lower triangular blocks.
and
give the minimum size of arrays
irn and
a (and
icn) respectively which would enable a successful run on an identical matrix (but some ‘elbow-room’ should be allowed – see
Section 9).
to
are only used if
.
- gives the structural rank of the matrix.
- gives the number of diagonal blocks.
- gives the size of the largest diagonal block.
You must
not change
idisp between a call of
f01brf and subsequent calls to
f01bsf or
f04axf.
-
16:
– Integer
Input/Output
-
This routine uses an
ifail input value codification that differs from the normal case to distinguish between errors and warnings (see
Section 4 in the Introduction to the NAG Library FL Interface).
On entry:
ifail must be set to one of the values below to set behaviour on detection of an error; these values have no effect when no error is detected. The behaviour relate to whether or not program execution is halted and whether or not messages are printed when an error or warning is detected.
ifail |
Execution |
Error Printing |
Warning Printed |
|
halted |
No |
No |
|
continue |
No |
No |
|
halted |
Yes |
No |
|
continue |
Yes |
No |
|
halted |
No |
Yes |
|
continue |
No |
Yes |
|
halted |
Yes |
Yes |
|
continue |
Yes |
Yes |
For environments where it might be inappropriate to halt program execution when an error is detected, the value
,
,
or
is recommended. If the printing of messages is undesirable, then the value
is recommended. Otherwise, the recommended value is
.
When the value , , or is used it is essential to test the value of ifail on exit.
On exit:
unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry
or
, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
-
Matrix is structurally singular – decomposition aborted.
Matrix is structurally singular – decomposition aborted .
-
Matrix is numerically singular – decomposition aborted.
-
lirn too small. Decomposition aborted at stage
in block
with first row
and last row
.
lirn too small. Decomposition aborted at stage
in block
with first row
and last row
. To continue set
lirn to at least
.
-
licn much too small. Decomposition aborted at stage
in block
with first row
and last row
.
licn much too small. Decomposition aborted at stage
in block
with first row
and last row
. To continue set
licn to at least
.
-
licn too small. Decomposition aborted at stage
in block
with first row
and last row
.
licn too small. Decomposition aborted at stage
in block
with first row
and last row
. To continue set
licn to at least
.
licn too small. For successful decomposition set
licn to at least
.
-
licn and
lirn too small. Decomposition aborted at stage
in block
with first row
and last row
.
licn and
lirn too small. Decomposition aborted at stage
in block
with first row
and last row
. To continue set
lirn to at least
.
-
licn not big enough for permutation – increase by
.
-
On entry, .
Constraint: .
-
On entry, .
Constraint: .
-
On entry, and .
Constraint: .
-
On entry, and .
Constraint: .
-
On entry, or is out of range: , , .
-
On entry, duplicate elements found – see advisory messages.
-
Matrix is structurally singular – decomposition completed.
-
Matrix is numerically singular – decomposition completed.
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 7 in the Introduction to the NAG Library FL Interface for further information.
Your licence key may have expired or may not have been installed correctly.
See
Section 8 in the Introduction to the NAG Library FL Interface for further information.
Dynamic memory allocation failed.
See
Section 9 in the Introduction to the NAG Library FL Interface for further information.
7
Accuracy
The factorization obtained is exact for a perturbed matrix whose th element differs from by less than where is the machine precision, is the growth value returned in if , and the number of Gaussian elimination operations applied to element . The value of is not greater than and is usually much less. Small values therefore guarantee accurate results, but unfortunately large values may give a very pessimistic indication of accuracy.
8
Parallelism and Performance
f01brf is not threaded in any implementation.
The time required may be estimated very roughly from the number
of nonzeros in the factorized form (output as
) and for
f01brf and its associates is
where our unit is the time for the inner loop of a full matrix code (e.g., solving a full set of equations takes about
units). Note that the faster
f01bsf time makes it well worthwhile to use this for a sequence of problems with the same pattern.
It should be appreciated that
varies widely from problem to problem. For network problems it may be little greater than
nz, the number of nonzeros in
; for discretization of two-dimensional and three-dimensional partial differential equations it may be about
and
, respectively.
The time taken by f01brf to find the block lower triangular form () is typically of the time taken by the routine when it is not found (). If the matrix is irreducible ( after a call with ) then this time is wasted. Otherwise, particularly if the largest block is small (), the consequent savings are likely to be greater.
The time taken to estimate growth () is typically under of the overall time.
The overall time may be substantially increased if there is inadequate ‘elbow-room’ in the arrays
a,
irn and
icn. When the sizes of the arrays are minimal (
and
) it can execute as much as three times slower. Values of
and
greater than about
indicate that it may be worthwhile to increase array sizes.
The use of a relative pivot tolerance
pivot essentially presupposes that the matrix is well-scaled, i.e., that the matrix elements are broadly comparable in size. Practical problems are often naturally well-scaled but particular care is needed for problems containing mixed types of variables (for example millimetres and neutron fluxes).
It is envisaged that
f01brf will almost always be called for square nonsingular matrices and that singularity indicates an error condition. However, even if the matrix is singular it is possible to complete the factorization. It is even possible for
f04axf to solve a set of equations whose matrix is singular provided the set is consistent.
Two forms of singularity are possible. If the matrix would be singular for any values of the nonzeros (e.g., if it has a whole row of zeros), then we say it is structurally singular, and continue only if . If the matrix is nonsingular by virtue of the particular values of the nonzeros, then we say that it is numerically singular and continue only if , in which case an upper bound on the rank of the matrix is returned in when .
Rectangular matrices may be treated by setting
n to the larger of the number of rows and numbers of columns and setting
.
Note: the soft
failure option should be used (last digit of ) if you wish to factorize singular matrices with or set to .FALSE..
The matrix may consist of a sum of contributions from different sub-systems (for example finite elements). In such cases you may rely on f01brf to perform assembly, since duplicated elements are summed.
To compute the determinant of
after a call of
f01brf, the following
code may be used to
det = 1.0
id = idisp(1)
Do 10 i = 1, n
idg = id + ikeep(3*n+i)
det = det*a(idg)
If (ikeep(n+i).ne.i)det = -det
If (ikeep(2*n+i).ne.i)det = -det
id = id + ikeep(i)
10 Continue
the real variable
det will then hold the determinant.
10
Example
This example factorizes the real sparse matrix:
This example program simply prints out some information about the factorization as returned by
f01brf in
and
idisp. Normally the call of
f01brf would be followed by a call of
f04axf (see
Section 10 in
f04axf).
10.1
Program Text
10.2
Program Data
10.3
Program Results