NAG FL Interface
F04 (Linsys)
Simultaneous Linear Equations

Settings help

FL Name Style:

FL Specification Language:

1 Scope of the Chapter

This chapter is concerned with the solution of the matrix equation AX=B, where B may be a single vector or a matrix of multiple right-hand sides. The matrix A may be real, complex, symmetric, Hermitian, positive definite, positive definite Toeplitz or banded. It may also be rectangular, in which case a least squares solution is obtained.
Much of the functionality of this chapter has been superseded by routines from Chapters F07 and F08 (LAPACK routines) as those chapters have grown and have included driver and expert driver routines.
For a general introduction to sparse systems of equations, see the F11 Chapter Introduction, which provides routines for large sparse systems. Some routines for sparse problems are also included in this chapter; they are described in Section 3.4.

2 Background to the Problems

A set of linear equations may be written in the form
where the known matrix A, with real or complex coefficients, is of size m×n (m rows and n columns), the known right-hand vector b has m components (m rows and one column), and the required solution vector x has n components (n rows and one column). There may also be p vectors bi, for i=1,2,,p, on the right-hand side and the equations may then be written as
the required matrix X having as its p columns the solutions of Axi=bi, for i=1,2,,p. Some routines deal with the latter case, but for clarity only the case p=1 is discussed here.
The most common problem, the determination of the unique solution of Ax=b, occurs when m=n and A is not singular, that is rank(A)=n. This is discussed in Section 2.1 below. The next most common problem, discussed in Section 2.2 below, is the determination of the least squares solution of Axb required when m>n and rank(A)=n, i.e., the determination of the vector x which minimizes the norm of the residual vector r=b-Ax. All other cases are rank deficient, and they are treated in Section 2.3.

2.1 Unique Solution of Ax=b

Most routines in this chapter solve this particular problem. The computation starts with the triangular decomposition A=PLU, where L and U are respectively lower and upper triangular matrices and P is a permutation matrix, chosen so as to ensure that the decomposition is numerically stable. The solution is then obtained by solving in succession the simpler equations
Ly = PTb Ux = y  
the first by forward-substitution and the second by back-substitution.
If A is real symmetric and positive definite, U=LT, while if A is complex Hermitian and positive definite, U=LH; in both these cases P is the identity matrix (i.e., no permutations are necessary). In all other cases either U or L has unit diagonal elements.
Due to rounding errors the computed ‘solution’ x0, say, is only an approximation to the true solution x. This approximation will sometimes be satisfactory, agreeing with x to several figures, but if the problem is ill-conditioned then x and x0 may have few or even no figures in common, and at this stage there is no means of estimating the ‘accuracy’ of x0.
There are three possible approaches to estimating the accuracy of a computed solution.
One way to do so, and to ‘correct’ x0 when this is meaningful (see next paragraph), involves computing the residual vector r=b-Ax0 in extended precision arithmetic, and obtaining a correction vector d by solving PLUd=r. The new approximate solution x0+d is usually more accurate and the correcting process is repeated until (a) further corrections are negligible or (b) they show no further decrease.
It must be emphasized that the ‘true’ solution x may not be meaningful, that is correct to all figures quoted, if the elements of A and b are known with certainty only to say p figures, where p is less than full precision. The first correction vector d will then give some useful information about the number of figures in the ‘solution’ which probably remain unchanged with respect to maximum possible uncertainties in the coefficients.
An alternative approach to assessing the accuracy of the solution is to compute or estimate the condition number of A, defined as
κ(A) = A . A-1 .  
Roughly speaking, errors or uncertainties in A or b may be amplified in the solution by a factor κ(A). Thus, for example, if the data in A and b are only accurate to 5 digits and κ(A)103, then the solution cannot be guaranteed to have more than 2 correct digits. If κ(A)105, the solution may have no meaningful digits.
To be more precise, suppose that
Ax=b  and  (A+δA)(x+δx)=b+δb.  
Here δA and δb represent perturbations to the matrices A and b which cause a perturbation δx in the solution. We can define measures of the relative sizes of the perturbations in A, b and x as
ρA=δA A ,  ρb=δb b   and  ρx=δx x   respectively.  
ρxκ (A) 1-κ (A)ρA (ρA+ρb)  
provided that κ(A)ρA<1. Often κ(A)ρA1 and then the bound effectively simplifies to
Hence, if we know κ(A), ρA and ρb, we can compute a bound on the relative errors in the solution. Note that ρA, ρb and ρx are defined in terms of the norms of A, b and x. If A, b or x contains elements of widely differing magnitude, then ρA, ρb and ρx will be dominated by the errors in the larger elements, and ρx will give no information about the relative accuracy of smaller elements of x.
A third way to obtain useful information about the accuracy of a solution is to solve two sets of equations, one with the given coefficients, which are assumed to be known with certainty to p figures, and one with the coefficients rounded to (p-1) figures, and to count the number of figures to which the two solutions agree. In ill-conditioned problems this can be surprisingly small and even zero.

2.2 The Least Squares Solution of Axb, m>n, rank(A)=n

The least squares solution is the vector x^ which minimizes the sum of the squares of the residuals,
The solution is obtained in two steps.
  1. (a)Householder transformations are used to reduce A to ‘simpler form’ via the equation QA=R, where R has the appearance
    with R^ a nonsingular upper triangular n×n matrix and 0 a zero matrix of shape (m-n)×n. Similar operations convert b to Qb=c, where
    with c1 having n rows and c2 having (m-n) rows.
  2. (b)The required least squares solution is obtained by back-substitution in the equation
Again due to rounding errors the computed x^0 is only an approximation to the required x^, but as in Section 2.1, this can be improved by ‘iterative refinement’. The first correction d is the solution of the least squares problem
and since the matrix A is unchanged, this computation takes less time than that of the original x^0. The process can be repeated until further corrections are (a) negligible or (b) show no further decrease.

2.3 Rank-deficient Cases

If, in the least squares problem just discussed, rank(A)<n, then a least squares solution exists but it is not unique. In this situation it is usual to ask for the least squares solution ‘of minimal length’, i.e., the vector x which minimizes x2, among all those x for which b-Ax2 is a minimum.
This can be computed from the Singular Value Decomposition (SVD) of A, in which A is factorized as
where Q is an m×n matrix with orthonormal columns, P is an n×n orthogonal matrix and D is an n×n diagonal matrix. The diagonal elements of D are called the ‘singular values’ of A; they are non-negative and can be arranged in decreasing order of magnitude:
The columns of Q and P are called respectively the left and right singular vectors of A. If the singular values dr+1,,dn are zero or negligible, but dr is not negligible, then the rank of A is taken to be r (see also Section 2.4) and the minimal length least squares solution of Axb is given by
where D is the diagonal matrix with diagonal elements d1−1,d2−1,,dr−1,0,,0.
The SVD may also be used to find solutions to the homogeneous system of equations Ax=0, where A is m×n. Such solutions exist if and only if rank(A)<n, and are given by
where the αi are arbitrary numbers and the pi are the columns of P which correspond to negligible elements of D.
The general solution to the rank-deficient least squares problem is given by x^+x, where x^ is the minimal length least squares solution and x is any solution of the homogeneous system of equations Ax=0.

2.4 The Rank of a Matrix

In theory the rank is r if n-r elements of the diagonal matrix D of the singular value decomposition are exactly zero. In practice, due to rounding and/or experimental errors, some of these elements have very small values which usually can and should be treated as zero.
For example, the following 5×8 matrix has rank 3 in exact arithmetic:
( 22 14 −1 −3 9 9 2 4 10 7 13 −2 8 1 −6 5 2 10 −1 13 1 −7 6 0 3 0 −11 −2 −2 5 5 −2 7 8 3 4 4 −1 1 2 ) .  
On a computer with 7 decimal digits of precision the computed singular values were
3.5×101,   2.0×101,   2.0×101,   1.3×10−6,   5.5×10−7  
and the rank would be correctly taken to be 3.
It is not, however, always certain that small computed singular values are really zero. With the 7×7 Hilbert matrix, for example, where aij=1/(i+j-1), the singular values are
1.7,  2.7×10−1,  2.1×10−2,  1.0×10−3,  2.9×10−5,  4.9×10−7,  3.5×10−9.  
Here there is no clear cut-off between small (i.e., negligible) singular values and larger ones. In fact, in exact arithmetic, the matrix is known to have full rank and none of its singular values is zero. On a computer with 7 decimal digits of precision, the matrix is effectively singular, but should its rank be taken to be 6, or 5, or 4?
It is, therefore, impossible to give an infallible rule, but generally the rank can be taken to be the number of singular values which are neither zero nor very small compared with other singular values. For example, if there is a sharp decrease in singular values from numbers of order unity to numbers of order 10−7, then the latter will almost certainly be zero in a machine in which 7 significant decimal figures is the maximum accuracy. Similarly for a least squares problem in which the data is known to about four significant figures and the largest singular value is of order unity then a singular value of order 10−4 or less should almost certainly be regarded as zero.
It should be emphasized that rank determination and least squares solutions can be sensitive to the scaling of the matrix. If at all possible the units of measurement should be chosen so that the elements of the matrix have data errors of approximately equal magnitude.

2.5 Generalized Linear Least Squares Problems

The simple type of linear least squares problem described in Section 2.2 can be generalized in various ways.
  1. 1.Linear least squares problems with equality constraints:
    find ​x​ to minimize ​S=c-Ax22  subject to  Bx=d,  
    where A is m×n and B is p×n, with pnm+p. The equations Bx=d may be regarded as a set of equality constraints on the problem of minimizing S. Alternatively the problem may be regarded as solving an overdetermined system of equations
    ( A B ) x=( c d ) ,  
    where some of the equations (those involving B) are to be solved exactly, and the others (those involving A) are to be solved in a least squares sense. The problem has a unique solution on the assumptions that B has full row rank p and the matrix ( A B ) has full column rank n. (For linear least squares problems with inequality constraints, refer to Chapter E04.)
  2. 2.General Gauss–Markov linear model problems:
    minimize ​y2  subject to  d=Ax+By,  
    where A is m×n and B is m×p, with nmn+p. When B=I, the problem reduces to an ordinary linear least squares problem. When B is square and nonsingular, it is equivalent to a weighted linear least squares problem:
    find ​x​ to minimize ​B-1(d-Ax)2.  
    The problem has a unique solution on the assumptions that A has full column rank n, and the matrix (A,B) has full row rank m.

2.6 Calculating the Inverse of a Matrix

The routines in this chapter can also be used to calculate the inverse of a square matrix A by solving the equation
where I is the identity matrix. However, solving the equations AX=B by calculation of the inverse of the coefficient matrix A, i.e., by X=A-1B, is definitely not recommended.
Similar remarks apply to the calculation of the pseudo-inverse of a singular or rectangular matrix.

2.7 Estimating the 1-norm of a Matrix

The 1-norm of a matrix A is defined to be:
A1 = max 1jn i=1 m |aij| (1)
Typically it is useful to calculate the condition number of a matrix with respect to the solution of linear equations, or inversion. The higher the condition number the less accuracy might be expected from a numerical computation. A condition number for the solution of linear equations is A.A-1. Since this might be a relatively expensive computation it often suffices to estimate the norm of each matrix.

3 Recommendations on Choice and Use of Available Routines

See also Section 3 in the F07 Chapter Introduction for recommendations on the choice of available routines from that chapter.

3.1 Black Box and General Purpose Routines

Most of the routines in this chapter are categorised either as Black Box routines or general purpose routines.
Black Box routines solve the equations Axi=bi, for i=1,2,,p, in a single call with the matrix A and the right-hand sides, bi, being supplied as data. These are the simplest routines to use and are suitable when all the right-hand sides are known in advance and do not occupy too much storage.
General purpose routines, in general, require a previous call to a routine in Chapters F01 or F07 to factorize the matrix A. This factorization can then be used repeatedly to solve the equations for one or more right-hand sides which may be generated in the course of the computation. The Black Box routines simply call a factorization routine and then a general purpose routine to solve the equations.
The routine f04qaf which uses an iterative method for sparse systems of equations does not fit easily into this categorization, but is classified as a general purpose routine in the decision trees and indexes.

3.2 Systems of Linear Equations

Most of the routines in this chapter solve linear equations Ax=b when A is n×n and a unique solution is expected (see Section 2.1). The matrix A may be ‘general’ real or complex, or may have special structure or properties, e.g., it may be banded, tridiagonal, almost block-diagonal, sparse, symmetric, Hermitian, positive definite (or various combinations of these).
It must be emphasized that it is a waste of computer time and space to use an inappropriate routine, for example one for the complex case when the equations are real. It is also unsatisfactory to use the special routines for a positive definite matrix if this property is not known in advance.
Routines are given for calculating the approximate solution, that is solving the linear equations just once, and also for obtaining the accurate solution by successive iterative corrections of this first approximation using additional precision arithmetic, as described in Section 2.1. The latter, of course, are more costly in terms of time and storage, since each correction involves the solution of n sets of linear equations and since the original A and its LU decomposition must be stored together with the first and successively corrected approximations to the solution. In practice the storage requirements for the ‘corrected’ routines are about double those of the ‘approximate’ routines, though the extra computer time may not be prohibitive since the same matrix and the same LU decomposition is used in every linear equation solution.
A number of the Black Box routines in this chapter return estimates of the condition number and the forward error, along with the solution of the equations. But for those routines that do not return a condition estimate two routines are provided – f04ydf for real matrices, f04zdf for complex matrices – which can return a cheap but reliable estimate of A-1, and hence an estimate of the condition number κ(A) (see Section 2.1). These routines can also be used in conjunction with most of the linear equation solving routines in Chapter F11: further advice is given in the routine documents.
Other routines for solving linear equation systems, computing inverse matrices, and estimating condition numbers can be found in Chapter F07, which contains LAPACK software.

3.3 Linear Least Squares Problems

The majority of the routines for solving linear least squares problems are to be found in Chapter F08.
For the case described in Section 2.2, when mn and a unique least squares solution is expected, there are two routines for a general real A, one of which (f04jgf) computes a first approximation and the other (f04amf) computes iterative corrections. If it transpires that rank(A)<n, so that the least squares solution is not unique, then f04amf takes a failure exit, but f04jgf proceeds to compute the minimal length solution by using the SVD (see below).
If A is expected to be of less than full rank then one of the routines for calculating the minimal length solution may be used.
For mn the use of the SVD is not significantly more expensive than the use of routines based upon the QR factorization.
Problems with linear equality constraints can be solved by f08zaf (for real data) or by f08znf (for complex data), provided that the problems are of full rank. Problems with linear inequality constraints can be solved by e04ncf/​e04nca in Chapter E04.
General Gauss–Markov linear model problems, as formulated in Section 2.5, can be solved by f08zbf (for real data) or by f08zpf (for complex data).

3.4 Sparse Matrix Routines

Routines specifically for sparse matrices are appropriate only when the number of nonzero elements is very small, less than, say, 10% of the n2 elements of A, and the matrix does not have a relatively small band width.
Chapter F11 contains routines for both the direct and iterative solution of sparse linear systems. There are two routines in Chapter F04 for solving sparse linear equations (f04axf and f04qaf). f04axf utilizes a factorization of the matrix A obtained from f01brf or f01bsf, while f04qaf uses an iterative technique and requires a user-supplied function to compute matrix-vector products Ac and ATc for any given vector c.
f04qaf solves sparse least squares problems by an iterative technique, and also allows the solution of damped (regularized) least squares problems (see the routine document for details).

4 Decision Trees

The name of the routine (if any) that should be used to factorize the matrix A is given in brackets after the name of the routine for solving the equations.

Tree 1: Black Box routines for unique solution of Ax=b (Real matrix)

Is A symmetric?   Is A positive definite?   Is A a band matrix?   Is A tridiagonal?   f04bgf (see Note 1) or f07jaf or f07jbf (see Note 2)
  no   no   no   no
f04bff (see Note 1) or f07haf or f07hbf (see Note 2)
Is A a Toeplitz matrix?   Are the equations the Yule–Walker equations?   f04fef
  no   no
Do you require an accurate solution using iterative refinement?   f07fbf
Is one triangle of A stored as a linear array?   f04bef (see Note 1) or f07gaf or f07gbf (see Note 2)
f04bdf (see Note 1) or f07faf or f07fbf (see Note 2)
Is one triangle of A stored as a linear array?   f04bjf (see Note 1) or f07paf or f07pbf (see Note 2)
f04bhf (see Note 1) or f07maf or f07mbf (see Note 2)
Is A a band matrix?   Is A tridiagonal?   f04bcf (see Note 1) or f07caf or f07cbf (see Note 2)
  no   no
f04bbf (see Note 1) or f07baf or f07bbf (see Note 2)
Do you require an accurate solution using iterative refinement?   f07abf
f04baf (see Note 1) or f07aaf or f07abf (see Note 2)

Tree 2: Black Box routines for unique solution of Ax=b (Complex matrix)

Is A Hermitian?   Is A positive definite?   Is A a band matrix?   Is A tridiagonal?   f04cgf (see Note 1) or
f07jnf or f07jpf (see Note 2)
  no   no   no   no
f04cff (see Note 1) or f07hnf or f07hpf (see Note 2)
Is one triangle of A stored as a linear array?   f04cef (see Note 1) or f07gnf or f07gpf (see Note 2)
f04cdf (see Note 1) or f07fnf or f07fpf (see Note 2)
Is one triangle of A stored as a linear array?   f04cjf (see Note 1) or f07pnf or f07ppf (see Note 2)
f04chf (see Note 1) or f07mnf or f07mpf (see Note 2)
Is A symmetric?   Is one triangle of A stored as a linear array?   f04djf (see Note 1) or f07qnf or f07qpf (see Note 2)
  no   no
f04dhf (see Note 1) or f07nnf or f07npf (see Note 2)
Is A a band matrix?   Is A tridiagonal?   f04ccf (see Note 1) or f07cnf or f07cpf (see Note 2)
  no   no
f04cbf (see Note 1) or f07bnf or f07bpf (see Note 2)
f04caf (see Note 1) or f07anf or f07apf (see Note 2)

Tree 3: General purpose routines for unique solution of Ax=b (Real matrix)

Is A a sparse matrix and not banded?   Chapter F11 or f04axf (f01brf or f01bsf) or f04qaf
Is A symmetric?   Is A positive definite?   Is A band matrix?   Is A tridiagonal?   f07jef (f07jdf)
  no   no   no   no
Is A variable band width?   f04mcf (f01mcf)
f07hef (f07hdf)
Is A a Toeplitz matrix?   Are the equations the Yule–Walker equations?   f04mef
  no   no
Is one triangle of A stored as a linear array?   f07gef (f07gdf)
f07fef (f07fdf)
Is one triangle of A stored as a linear array?   f07pef (f07pdf)
f07mef (f07mdf)
Is A triangular?   Is A a band matrix?   f07vef
  no   no
Is A stored as a linear array?   f07uef
Is A a band matrix?   Is A tridiagonal?   f04lef (f01lef) or f07cef (f07cdf)
  no   no
Is A almost block diagonal?   f04lhf (f01lhf)
f07bef (f07bdf)
f07aef (f07adf)

Tree 4: General purpose routines for unique solution of Ax=b (Complex matrix)

Is A a sparse matrix and not banded?   Chapter F11
Is A Hermitian?   Is A positive definite?   Is A a band matrix?   Is A tridiagonal?   f07jsf (f07jrf)
  no   no   no   no
f07hsf (f07hrf)
Is one triangle of A stored as a linear array?   f07gsf (f07grf)
f07fsf (f07frf)
Is one triangle of A stored as a linear array?   f07psf (f07prf)
f07msf (f07mrf)
Is A symmetric?   Is one triangle of A stored as a linear array?   f07qsf (f07qrf)
  no   no
f07nsf (f07nrf)
Is A triangular?   Is A a band matrix?   f07vsf
  no   no
Is A stored as a linear array?   f07usf
Is A a band matrix?   Is A tridiagonal?   f07csf (f07crf)
  no   no
f07bsf (f07brf)
f07asf (f07arf)

Tree 5: General purpose routines for least squares and homogeneous equations (without constraints)

Is the problem Ax=0?   f08kaf
Is A sparse?   f04qaf
Is rank(A)=n?   Are storage and time more important than accuracy?   f04jgf
  no   no
Is m>n?   f04jgf or f08kaf
Note: there are also routines in Chapter F08 for solving least squares problems.
Note 1: also returns an estimate of the condition number and the forward error.
Note 2: also returns an estimate of the condition number, the forward error and the backward error. Requires additional workspace.

5 Functionality Index

Black Box routines, Ax=b,  
complex general band matrix   f04cbf
complex general matrix   f04caf
complex general tridiagonal matrix   f04ccf
complex Hermitian matrix,  
packed matrix format   f04cjf
standard matrix format   f04chf
complex Hermitian positive definite band matrix   f04cff
complex Hermitian positive definite matrix,  
packed matrix format   f04cef
standard matrix format   f04cdf
complex Hermitian positive definite tridiagonal matrix   f04cgf
complex symmetric matrix,  
packed matrix format   f04djf
standard matrix format   f04dhf
real general band matrix   f04bbf
real general matrix,  
multiple right-hand sides,  
iterative refinement using additional precision   f04aef
multiple right-hand sides, standard precision   f04baf
single right-hand side,  
iterative refinement using additional precision   f04atf
real general tridiagonal matrix   f04bcf
real symmetric matrix,  
packed matrix format   f04bjf
standard matrix format   f04bhf
real symmetric positive definite band matrix   f04bff
real symmetric positive definite matrix,  
multiple right-hand sides,  
iterative refinement using additional precision   f04abf
multiple right-hand sides, standard precision   f04bdf
packed matrix format   f04bef
single right-hand side,  
iterative refinement using additional precision   f04asf
real symmetric positive definite Toeplitz matrix,  
general right-hand side   f04fff
Yule–Walker equations   f04fef
real symmetric positive definite tridiagonal matrix   f04bgf
General Purpose routines, Ax=b,  
real almost block-diagonal matrix   f04lhf
real band symmetric positive definite matrix, variable bandwidth   f04mcf
real sparse matrix,  
direct method   f04axf
iterative method   f04qaf
real symmetric positive definite Toeplitz matrix,  
general right-hand side, update solution   f04mff
Yule–Walker equations, update solution   f04mef
real tridiagonal matrix   f04lef
Least squares and Homogeneous Equations,  
real m×n matrix,  
mn, rank =n or minimal solution   f04jgf
rank =n, iterative refinement   f04amf
real sparse matrix   f04qaf
Service Routines,  
complex rectangular matrix,  
norm and condition number estimation   f04zdf
real matrix,  
covariance matrix for linear least squares problems   f04yaf
real rectangular matrix,  
norm and condition number estimation   f04ydf

6 Auxiliary Routines Associated with Library Routine Arguments


7 Withdrawn or Deprecated Routines

The following lists all those routines that have been withdrawn since Mark 24 of the Library or are in the Library, but deprecated.
Routine Status Replacement Routine(s)
f04abf Withdrawn at Mark 28.3 f07fbf
f04aef Withdrawn at Mark 28.3 f07abf
f04aff Withdrawn at Mark 25 No replacement routine required
f04agf Withdrawn at Mark 25 No replacement routine required
f04ahf Withdrawn at Mark 25 No replacement routine required
f04ajf Withdrawn at Mark 25 No replacement routine required
f04asf Withdrawn at Mark 28.3 f07fbf
f04atf Withdrawn at Mark 28.3 f07abf
f04ycf Withdrawn at Mark 26 f04ydf
f04zcf Withdrawn at Mark 26 f04zdf

8 References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Lawson C L and Hanson R J (1974) Solving Least Squares Problems Prentice–Hall
Wilkinson J H and Reinsch C (1971) Handbook for Automatic Computation II, Linear Algebra Springer–Verlag