NAG Toolbox Chapter Introduction
F04 — simultaneous linear equations
Scope of the Chapter
This chapter is concerned with the solution of the matrix equation AX = B, where B may be a single vector or a matrix of multiple right-hand sides. The matrix A may be real, complex, symmetric, Hermitian, positive definite, positive definite Toeplitz or banded. It may also be rectangular, in which case a least squares solution is obtained.
Much of the functionality of this chapter has been superseded by functions from
Chapters F07 and
F08 (LAPACK routines) as those chapters have grown and have included driver and expert driver functions.
For a general introduction to sparse systems of equations, see the
F11 Chapter Introduction, which provides functions for large sparse systems.
Some functions for sparse problems are also included in this chapter; they are described in
Sparse Matrix Functions.
Background to the Problems
A set of linear equations may be written in the form

    Ax = b,

where the known matrix A, with real or complex coefficients, is of size m by n (m rows and n columns), the known right-hand vector b has m components (m rows and one column), and the required solution vector x has n components (n rows and one column). There may also be p vectors b_i, for i = 1, 2, …, p, on the right-hand side and the equations may then be written as

    AX = B,

the required matrix X having as its p columns the solutions of Ax_i = b_i, for i = 1, 2, …, p. Some functions deal with the latter case, but for clarity only the case p = 1 is discussed here.
The most common problem, the determination of the unique solution of Ax = b, occurs when m = n and A is not singular, that is rank(A) = n. This is discussed in Unique Solution of Ax = b below. The next most common problem, discussed in The Least Squares Solution of Ax ≃ b, m > n, rank(A) = n below, is the determination of the least squares solution of Ax ≃ b required when m > n and rank(A) = n, i.e., the determination of the vector x which minimizes the norm of the residual vector r = b − Ax. All other cases are rank deficient, and they are treated in Rank-deficient Cases.
Unique Solution of Ax = b
Most functions in this chapter solve this particular problem. The computation starts with the triangular decomposition A = PLU, where L and U are respectively lower and upper triangular matrices and P is a permutation matrix, chosen so as to ensure that the decomposition is numerically stable. The solution is then obtained by solving in succession the simpler equations

    Ly = P^T b,   Ux = y,

the first by forward-substitution and the second by back-substitution.
If A is real symmetric and positive definite, U = L^T, while if A is complex Hermitian and positive definite, U = L^H; in both these cases P is the identity matrix (i.e., no permutations are necessary). In all other cases either L or U has unit diagonal elements.
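For illustration only, the factorize-and-substitute process can be sketched with MATLAB built-ins (not the chapter's functions); the matrix A and right-hand side b below are assumed example data:

```matlab
% Sketch: solve A*x = b via the decomposition P*A = L*U followed by
% forward- and back-substitution. Illustrative only; not a NAG call.
A = [4 -2 1; -2 6 -4; 1 -4 5];   % assumed example data
b = [1; 2; 3];

[L, U, P] = lu(A);               % L unit lower triangular, U upper triangular
y = L \ (P*b);                   % forward-substitution:  L*y = P*b
x = U \ y;                       % back-substitution:     U*x = y

% For a real symmetric positive definite A, a Cholesky factorization
% A = R'*R (no permutations needed) can be used instead:
R = chol(A);
x_spd = R \ (R' \ b);
```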
Due to rounding errors the computed ‘solution’ x_0, say, is only an approximation to the true solution x. This approximation will sometimes be satisfactory, agreeing with x to several figures, but if the problem is ill-conditioned then x and x_0 may have few or even no figures in common, and at this stage there is no means of estimating the ‘accuracy’ of x_0.
There are three possible approaches to estimating the accuracy of a computed solution.
One way to do so, and to ‘correct’ x_0 when this is meaningful (see next paragraph), involves computing the residual vector r = b − Ax_0 in extended precision arithmetic, and obtaining a correction vector d_0 by solving Ad_0 = r. The new approximate solution x_1 = x_0 + d_0 is usually more accurate and the correcting process is repeated until (a) further corrections are negligible or (b) they show no further decrease.
It must be emphasized that the ‘true’ solution x may not be meaningful, that is correct to all figures quoted, if the elements of A and b are known with certainty only to, say, p figures, where p is less than full precision.
The first correction vector d_0 will then give some useful information about the number of figures in the ‘solution’ which probably remain unchanged with respect to the maximum possible uncertainties in the coefficients.
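A minimal sketch of this refinement loop, assuming A and b are available as above; note that MATLAB forms the residual in working precision here, whereas the chapter's ‘accurate solution’ functions use additional precision:

```matlab
% Sketch: iterative refinement of an approximate solution of A*x = b.
% The residual below is computed in working precision only.
[L, U, P] = lu(A);
x = U \ (L \ (P*b));                 % first approximation x0
for k = 1:10
    r = b - A*x;                     % residual of the current approximation
    d = U \ (L \ (P*r));             % correction from A*d = r, reusing the factors
    x = x + d;
    if norm(d) <= eps * norm(x)      % (a) corrections negligible: stop
        break
    end
end
```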
An alternative approach to assessing the accuracy of the solution is to compute or estimate the condition number of A, defined as

    κ(A) = ||A|| · ||A^{-1}||.

Roughly speaking, errors or uncertainties in A or b may be amplified in the solution by a factor κ(A). Thus, for example, if the data in A and b are accurate to about t significant digits and κ(A) is of order 10^k, then the solution cannot be guaranteed to have more than about t − k correct digits; if κ(A) is of order 10^t or larger, the solution may have no meaningful digits.
To be more precise, suppose that

    (A + δA)(x + δx) = b + δb.

Here δA and δb represent perturbations to the matrices A and b which cause a perturbation δx in the solution. We can define measures of the relative sizes of the perturbations in A, b and x as

    ρ_A = ||δA|| / ||A||,   ρ_b = ||δb|| / ||b||   and   ρ_x = ||δx|| / ||x||.

Then

    ρ_x ≤ κ(A) (ρ_A + ρ_b) / (1 − κ(A) ρ_A),

provided that κ(A) ρ_A < 1. Often κ(A) ρ_A ≪ 1 and then the bound effectively simplifies to

    ρ_x ≤ κ(A) (ρ_A + ρ_b).

Hence, if we know κ(A), ρ_A and ρ_b, we can compute a bound on the relative errors in the solution. Note that ρ_A, ρ_b and ρ_x are defined in terms of the norms of A, b and x. If A, b or x contains elements of widely differing magnitude, then ρ_A, ρ_b and ρ_x will be dominated by the errors in the larger elements, and ρ_x will give no information about the relative accuracy of smaller elements of x.
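For illustration, the simplified bound can be evaluated directly once κ(A) and the relative data errors are known; the values assigned to rhoA and rhoB below are assumptions:

```matlab
% Sketch: bound on the relative error of the solution, in the 1-norm.
kappa = cond(A, 1);                  % kappa(A) = norm(A,1)*norm(inv(A),1)
rhoA  = 1e-6;                        % assumed relative uncertainty in A
rhoB  = 1e-6;                        % assumed relative uncertainty in b
if kappa * rhoA < 1
    rhoX = kappa / (1 - kappa*rhoA) * (rhoA + rhoB);   % bound on norm(dx)/norm(x)
end
```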
A third way to obtain useful information about the accuracy of a solution is to solve two sets of equations, one with the given coefficients, which are assumed to be known with certainty to p figures, and one with the coefficients rounded to (p − 1) figures, and to count the number of figures to which the two solutions agree. In ill-conditioned problems this can be surprisingly small and even zero.
The Least Squares Solution of Ax ≃ b, m > n, rank(A) = n
The least squares solution is the vector x̂ which minimizes the sum of the squares of the residuals,

    S = ||b − Ax̂||_2^2 = (b − Ax̂)^T (b − Ax̂).

The solution is obtained in two steps.
(a) Householder transformations are used to reduce A to ‘simpler form’ via the equation QA = R, where R has the appearance

    R = ( R̂ )
        ( 0 )

with R̂ a nonsingular upper triangular n by n matrix and 0 a zero matrix of size (m − n) by n. Similar operations convert b to Qb = c, where

    c = ( c1 )
        ( c2 )

with c1 having n rows and c2 having (m − n) rows.

(b) The required least squares solution x̂ is obtained by back-substitution in the equation

    R̂ x̂ = c1.
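A sketch of these two steps using MATLAB's built-in Householder QR factorization (illustrative only; A and b are assumed data with m > n and rank(A) = n):

```matlab
% Sketch: full-rank least squares solution of A*x ~ b (m > n) via QR.
[Q, R] = qr(A, 0);       % 'economy size': A = Q*R with R upper triangular n-by-n
c1 = Q' * b;             % first n components of the transformed right-hand side
x  = R \ c1;             % back-substitution in R*x = c1
res = norm(b - A*x);     % norm of the residual vector
```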
Again due to rounding errors the computed x̂_0 is only an approximation to the required x̂, but as in Unique Solution of Ax = b, this can be improved by ‘iterative refinement’. The first correction d is the solution of the least squares problem

    Ad ≃ r,   where r = b − Ax̂_0 is the residual,

and since the matrix A is unchanged, this computation takes less time than that of the original x̂_0. The process can be repeated until further corrections (a) are negligible or (b) show no further decrease.
Rank-deficient Cases
If, in the least squares problem just discussed, rank(A) < n, then a least squares solution exists but it is not unique. In this situation it is usual to ask for the least squares solution ‘of minimal length’, i.e., the vector x which minimizes ||x||_2, among all those x for which ||b − Ax||_2 is a minimum.
This can be computed from the Singular Value Decomposition (SVD) of A, in which A is factorized as

    A = Q D P^T,

where Q is an m by n matrix with orthonormal columns, P is an n by n orthogonal matrix and D is an n by n diagonal matrix. The diagonal elements of D are called the ‘singular values’ of A; they are non-negative and can be arranged in decreasing order of magnitude:

    d_1 ≥ d_2 ≥ … ≥ d_n ≥ 0.

The columns of Q and P are called respectively the left and right singular vectors of A. If the singular values d_{r+1}, …, d_n are zero or negligible, but d_r is not negligible, then the rank of A is taken to be r (see also The Rank of a Matrix) and the minimal length least squares solution of Ax ≃ b is given by

    x̂ = P D^+ Q^T b,

where D^+ is the diagonal matrix with diagonal elements 1/d_1, 1/d_2, …, 1/d_r, 0, …, 0.
The SVD may also be used to find solutions to the homogeneous system of equations Ax = 0, where A is m by n. Such solutions exist if and only if rank(A) < n, and are given by

    x = α_{r+1} p_{r+1} + … + α_n p_n,

where the α_i are arbitrary numbers and the p_i are the columns of P which correspond to negligible elements of D.
The general solution to the rank-deficient least squares problem is given by x = x̂ + z, where x̂ is the minimal length least squares solution and z is any solution of the homogeneous system of equations Ax = 0.
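A sketch of the minimal length solution and the homogeneous solutions via MATLAB's built-in SVD; the tolerance used to decide which singular values are negligible is an assumption:

```matlab
% Sketch: rank-deficient least squares via the SVD  A = Q*D*P'.
[Q, D, P] = svd(A, 'econ');
d   = diag(D);
tol = max(size(A)) * eps(d(1));      % assumed test for 'negligible' singular values
r   = sum(d > tol);                  % numerical rank

dinv        = zeros(size(d));
dinv(1:r)   = 1 ./ d(1:r);           % reciprocals of the significant singular values
xhat        = P * (dinv .* (Q'*b));  % minimal length least squares solution

Pnull = P(:, r+1:end);               % right singular vectors for negligible d(i)
alpha = randn(size(Pnull, 2), 1);    % arbitrary coefficients
x_gen = xhat + Pnull * alpha;        % a general solution: same residual, longer x
```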
The Rank of a Matrix
In theory the rank is r if the n − r smallest singular values, d_{r+1}, …, d_n, are exactly zero. In practice, due to rounding and/or experimental errors, some of these elements have very small values which usually can and should be treated as zero.
For example, a matrix whose rank is r in exact arithmetic will, when its singular values are computed on a machine of finite precision, typically yield n − r values that are not exactly zero but are tiny compared with the largest singular value; provided the gap is clear, the rank is correctly taken to be r.
It is not, however, always certain that small computed singular values are really zero. With the n by n Hilbert matrix, for example, whose (i, j) element is 1/(i + j − 1), the singular values decay steadily towards zero and there is no clear cut-off between small (i.e., negligible) singular values and larger ones. In fact, in exact arithmetic, the matrix is known to have full rank and none of its singular values is zero. On a computer working to a limited number of decimal digits of precision, the matrix is effectively singular, but it is far from obvious what its rank should be taken to be.
It is therefore impossible to give an infallible rule, but generally the rank can be taken to be the number of singular values which are neither zero nor very small compared with other singular values. For example, if there is a sharp decrease in singular values from numbers of order unity to numbers of order 10^-7, then the latter will almost certainly be zero in a machine in which 7 significant decimal figures is the maximum accuracy. Similarly, for a least squares problem in which the data is known to about four significant figures and the largest singular value is of order unity, a singular value of order 10^-4 or less should almost certainly be regarded as zero.
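The rule of thumb above is essentially what a tolerance-based numerical rank computation applies; a minimal sketch, where the choice of tol is an assumption that should reflect the accuracy of the data:

```matlab
% Sketch: numerical rank from the singular values and an explicit tolerance.
s   = svd(A);
tol = 1e-4 * s(1);       % assumed: data known to about four significant figures
r   = sum(s > tol);      % singular values at or below tol are treated as zero
% MATLAB's rank(A, tol) performs the same test.
```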
It should be emphasized that rank determination and least squares solutions can be sensitive to the scaling of the matrix. If at all possible the units of measurement should be chosen so that the elements of the matrix have data errors of approximately equal magnitude.
Generalized Linear Least Squares Problems
The simple type of linear least squares problem described in
The Least Squares Solution of Ax ≃ b, m > n, rank(A) = n can be generalized in various ways.
1. Linear least squares problems with equality constraints:

    minimize, over x, ||c − Ax||_2   subject to   Bx = d,

where A is m by n and B is p by n, with p ≤ n ≤ m + p. The equations Bx = d may be regarded as a set of equality constraints on the problem of minimizing ||c − Ax||_2. Alternatively the problem may be regarded as solving an overdetermined system of equations

    [ A ; B ] x ≃ [ c ; d ]

(where [ A ; B ] denotes A stacked above B), in which some of the equations (those involving B) are to be solved exactly, and the others (those involving A) are to be solved in a least squares sense. The problem has a unique solution on the assumptions that B has full row rank p and the matrix [ A ; B ] has full column rank n. (For linear least squares problems with inequality constraints, refer to Chapter E04.) A sketch of one way of solving such a problem is given after this list.
2. General Gauss–Markov linear model problems:

    minimize, over x and y, ||y||_2   subject to   d = Ax + By,

where A is m by n and B is m by p, with n ≤ m ≤ n + p. When B = I, the problem reduces to an ordinary linear least squares problem. When B is square and nonsingular, it is equivalent to a weighted linear least squares problem:

    minimize, over x, ||B^{-1}(d − Ax)||_2.

The problem has a unique solution on the assumptions that A has full column rank n, and the matrix [ A , B ] (A and B side by side) has full row rank m.
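As an illustration of the equality-constrained problem (1), here is a sketch of the null-space method using MATLAB built-ins; this is not the algorithm used by the library functions, and A, B, c and d are assumed data satisfying the rank conditions above:

```matlab
% Sketch: minimize ||c - A*x||_2 subject to B*x = d  (null-space method).
x0 = pinv(B) * d;            % a particular solution of B*x = d (B has full row rank)
Z  = null(B);                % orthonormal basis for the null space of B
w  = (A*Z) \ (c - A*x0);     % unconstrained least squares over the null space
x  = x0 + Z*w;               % satisfies B*x = d and minimizes the residual norm
```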
Calculating the Inverse of a Matrix
The functions in this chapter can also be used to calculate the inverse of a square matrix A by solving the equation

    AX = I,

where I is the identity matrix. However, solving the equations Ax = b by calculation of the inverse of the coefficient matrix A, i.e., by x = A^{-1} b, is definitely not recommended.
Similar remarks apply to the calculation of the pseudo-inverse of a singular or rectangular matrix.
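In MATLAB terms, x = A\b (which factorizes A and substitutes) is generally both faster and more accurate than forming the inverse explicitly; a minimal sketch:

```matlab
% Sketch: prefer a factorization-based solve over explicit inversion.
x_solve = A \ b;         % factorize and substitute: recommended
x_inv   = inv(A) * b;    % explicit inverse: slower and usually less accurate
```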
Estimating the 1-norm of a Matrix
The 1-norm of a matrix A is defined to be

    ||A||_1 = max over j of Σ_i |a_{ij}|,

that is, the maximum absolute column sum. Typically it is useful to calculate the condition number of a matrix with respect to the solution of linear equations, or inversion. The higher the condition number, the less accuracy might be expected from a numerical computation. A condition number for the solution of linear equations is κ_1(A) = ||A||_1 ||A^{-1}||_1. Since this might be a relatively expensive computation, it often suffices to estimate the norm of each matrix.
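For example, with MATLAB built-ins an exact 1-norm and a cheap condition estimate (analogous in spirit to what the chapter's estimation functions provide) can be obtained as follows:

```matlab
% Sketch: exact 1-norm versus an inexpensive condition number estimate.
nrmA  = norm(A, 1);      % exact 1-norm: maximum absolute column sum
k1    = cond(A, 1);      % exact 1-norm condition number (requires inv(A))
k1est = condest(A);      % cheap estimate of the 1-norm condition number
```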
Recommendations on Choice and Use of Available Functions
See also
Recommendations on Choice and Use of Available Functions in the F07 Chapter Introduction for recommendations on the choice of available functions from that chapter.
Black Box and General Purpose Functions
Most of the functions in this chapter are categorised either as Black Box functions or general purpose functions.
Black Box functions solve the equations Ax_i = b_i, for i = 1, 2, …, p, in a single call, with the matrix A and the right-hand sides b_i being supplied as data. These are the simplest functions to use and are suitable when all the right-hand sides are known in advance and do not occupy too much storage.
General purpose functions, in general, require a previous call to a function in
Chapters F01 or
F07 to factorize the matrix A. This factorization can then be used repeatedly to solve the equations for one or more right-hand sides which may be generated in the course of the computation. The Black Box functions simply call a factorization function and then a general purpose function to solve the equations.
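A sketch of this general purpose pattern, factorize once and reuse the factors for right-hand sides produced during the computation; make_rhs and nrhs below are hypothetical placeholders:

```matlab
% Sketch: factorize once, then solve repeatedly with the stored factors.
[L, U, P] = lu(A);                  % one factorization of A
for k = 1:nrhs                      % nrhs right-hand sides (assumed known)
    b = make_rhs(k);                % hypothetical: k-th right-hand side, generated on the fly
    x(:, k) = U \ (L \ (P*b));      % each solve reuses L, U and P
end
```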
The function
nag_linsys_real_gen_sparse_lsqsol (f04qa), which uses an iterative method for sparse systems of equations, does not fit easily into this categorization, but is classified as a general purpose function in the decision trees and indexes.
Systems of Linear Equations
Most of the functions in this chapter solve linear equations Ax = b when A is n by n and a unique solution is expected (see Unique Solution of Ax = b). The matrix A may be ‘general’ real or complex, or may have special structure or properties, e.g., it may be banded, tridiagonal, almost block-diagonal, sparse, symmetric, Hermitian, positive definite (or various combinations of these).
It must be emphasized that it is a waste of computer time and space to use an inappropriate function, for example one for the complex case when the equations are real. It is also unsatisfactory to use the special functions for a positive definite matrix if this property is not known in advance.
Functions are given for calculating the approximate solution, that is solving the linear equations just once, and also for obtaining the accurate solution by successive iterative corrections of this first approximation using additional precision arithmetic, as described in Unique Solution of Ax = b. The latter, of course, are more costly in terms of time and storage, since each correction involves the solution of further sets of linear equations and since the original A and its LU decomposition must be stored together with the first and successively corrected approximations to the solution. In practice the storage requirements for the ‘corrected’ functions are about double those of the ‘approximate’ functions, though the extra computer time may not be prohibitive since the same matrix and the same LU decomposition are used in every linear equation solution.
A number of the Black Box functions in this chapter return estimates of the condition number and the forward error, along with the solution of the equations. But for those functions that do not return a condition estimate two functions are provided –
nag_linsys_real_gen_norm_rcomm (f04yd) for real matrices,
nag_linsys_complex_gen_norm_rcomm (f04zd) for complex matrices – which can return a cheap but reliable estimate of ||A^{-1}||_1, and hence an estimate of the condition number κ_1(A) (see Unique Solution of Ax = b). These functions can also be used in conjunction with most of the linear equation solving functions in
Chapter F11: further advice is given in the function documents.
Other functions for solving linear equation systems, computing inverse matrices, and estimating condition numbers can be found in
Chapter F07, which contains LAPACK software.
Linear Least Squares Problems
The majority of the functions for solving linear least squares problems are to be found in
Chapter F08.
For the case described in
The Least Squares Solution of Ax ≃ b, m > n, rank(A) = n, when m > n and a unique least squares solution is expected, there are two functions for a general real A, one of which (nag_linsys_real_gen_solve (f04jg)) computes a first approximation and the other (nag_linsys_real_gen_lsqsol (f04am)) computes iterative corrections. If it transpires that rank(A) < n, so that the least squares solution is not unique, then nag_linsys_real_gen_lsqsol (f04am) takes a failure exit, but nag_linsys_real_gen_solve (f04jg) proceeds to compute the minimal length solution by using the SVD (see below).
If A is expected to be of less than full rank then one of the functions for calculating the minimal length solution may be used.
For m appreciably greater than n, the use of the SVD is not significantly more expensive than the use of functions based upon the QR factorization.
Problems with linear
equality constraints can be solved by
nag_lapack_dgglse (f08za) (for real data) or by
nag_lapack_zgglse (f08zn) (for complex data),
provided that the problems are of full rank. Problems with linear
inequality constraints can be solved by
nag_opt_lsq_lincon_solve (e04nc) in
Chapter E04.
General Gauss–Markov linear model problems, as formulated in
Generalized Linear Least Squares Problems, can be solved by
nag_lapack_dggglm (f08zb) (for real data) or by
nag_lapack_zggglm (f08zp) (for complex data).
Sparse Matrix Functions
Functions specifically for sparse matrices are appropriate only when the number of nonzero elements is very small, less than, say, 10% of the elements of A, and the matrix does not have a relatively small bandwidth.
Chapter F11 contains functions for both the direct and iterative solution of sparse linear systems. There are two functions in
Chapter F04 for solving sparse linear equations (
nag_linsys_real_sparse_fac_solve (f04ax) and
nag_linsys_real_gen_sparse_lsqsol (f04qa)).
nag_linsys_real_sparse_fac_solve (f04ax) utilizes a factorization of the matrix
obtained from
nag_matop_real_gen_sparse_lu (f01br) or
nag_matop_real_gen_sparse_lu_reuse (f01bs), while
nag_linsys_real_gen_sparse_lsqsol (f04qa) uses an iterative technique and requires a user-supplied function to compute matrix–vector products Av and A^T v for any given vector v.
nag_linsys_real_gen_sparse_lsqsol (f04qa) solves sparse least squares problems by an iterative technique, and also allows the solution of damped (regularized) least squares problems (see the function document for details).
Decision Trees
The name of the function (if any) that should be used to factorize the matrix A is given in brackets after the name of the function for solving the equations.
Tree 1: Black Box functions for unique solution of Ax = b (Real matrix)
Tree 2: Black Box functions for unique solution of Ax = b (Complex matrix)
Tree 3: General purpose functions for unique solution of Ax = b (Real matrix)
Tree 4: General purpose functions for unique solution of Ax = b (Complex matrix)
Tree 5: General purpose functions for least squares and homogeneous equations (without constraints)
Note: there are also functions in
Chapter F08 for solving least squares problems.
Note 1: also returns an estimate of the condition number and the forward error.
Note 2: also returns an estimate of the condition number, the forward error and the backward error. Requires additional workspace.
Note 3: for a single right-hand side only.
Functionality Index
Black Box functions, Ax = b,
    complex Hermitian matrix
    complex Hermitian positive definite matrix
    complex symmetric matrix
        multiple right-hand sides
    real symmetric positive definite matrix
        multiple right-hand sides
    real symmetric positive definite Toeplitz matrix
General Purpose functions, Ax = b,
    real symmetric positive definite Toeplitz matrix
Least Squares and Homogeneous Equations,
    complex rectangular matrix
© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2015