This chapter provides functions for various types of matrix eigenvalue problem:
standard eigenvalue problems (finding eigenvalues and eigenvectors of a square matrix $A$);
singular value problems (finding singular values and singular vectors of a rectangular matrix $A$);
generalized eigenvalue problems (finding eigenvalues and eigenvectors of a matrix pencil $A - \lambda B$);
quadratic eigenvalue problems (finding eigenvalues and eigenvectors of the quadratic $\lambda^2 A + \lambda B + C$).
Functions are provided for both real and complex data.
The majority of functions for these problems can be found in Chapter F08 which contains software derived from LAPACK (see Anderson et al. (1999)). However, you should read the F02 Chapter Introduction before turning to Chapter F08, especially if you are a new user. Chapter F12 contains functions for large sparse eigenvalue problems, although one such function is also available in this chapter.
Chapters F02 and F08 contain Black Box (or Driver) functions that enable many problems to be solved by a call to a single function, and the decision trees in Section 4 direct you to the most appropriate functions in Chapters F02 and F08. The Chapter F02 functions call functions in Chapters F07 and F08 wherever possible to perform the computations, and there are pointers in Section 4 to the relevant decision trees in Chapter F08.
2 Background to the Problems
Here we describe the different types of problem which can be tackled by the functions in this chapter, and give a brief outline of the methods used to solve them. If you have one specific type of problem to solve, you need only read the relevant sub-section and then turn to Section 3. Consult a standard textbook for a more thorough discussion, for example Golub and Van Loan (1996) or Parlett (1998).
In each sub-section, we first describe the problem in terms of real matrices. The changes needed to adapt the discussion to complex matrices are usually simple and obvious: a matrix transpose such as $Q^T$ must be replaced by its conjugate transpose $Q^H$; symmetric matrices must be replaced by Hermitian matrices, and orthogonal matrices by unitary matrices. Any additional changes are noted at the end of the sub-section.
2.1 Standard Eigenvalue Problems
Let $A$ be a square matrix of order $n$. The standard eigenvalue problem is to find eigenvalues, $\lambda$, and corresponding eigenvectors, $x \neq 0$, such that
$$Ax = \lambda x. \tag{1}$$
(The phrase ‘eigenvalue problem’ is sometimes abbreviated to eigenproblem.)
2.1.1 Standard symmetric eigenvalue problems
If $A$ is real symmetric, the eigenvalue problem has many desirable features, and it is advisable to take advantage of symmetry whenever possible.
The eigenvalues $\lambda_i$ are all real, and the eigenvectors $z_i$ can be chosen to be mutually orthogonal. That is, we can write
$$Az_i = \lambda_i z_i \quad \text{for } i = 1, 2, \ldots, n$$
or equivalently:
$$AZ = Z\Lambda \tag{2}$$
where $\Lambda$ is a real diagonal matrix whose diagonal elements $\lambda_i$ are the eigenvalues, and $Z$ is a real orthogonal matrix whose columns $z_i$ are the eigenvectors. This implies that $z_i^T z_j = 0$ if $i \neq j$, and $\|z_i\|_2 = 1$.
Equation (2) can be rewritten
$$A = Z\Lambda Z^T. \tag{3}$$
This is known as the eigen-decomposition or spectral factorization of $A$.
Eigenvalues of a real symmetric matrix are well-conditioned, that is, they are not unduly sensitive to perturbations in the original matrix $A$. The sensitivity of an eigenvector depends on how small the gap is between its eigenvalue and any other eigenvalue: the smaller the gap, the more sensitive the eigenvector. More details on the accuracy of computed eigenvalues and eigenvectors are given in the function documents, and in the F08 Chapter Introduction.
For dense or band matrices, the computation of eigenvalues and eigenvectors proceeds in the following stages:
1. $A$ is reduced to a symmetric tridiagonal matrix $T$ by an orthogonal similarity transformation: $A = QTQ^T$, where $Q$ is orthogonal. (A tridiagonal matrix is zero except for the main diagonal and the first subdiagonal and superdiagonal on either side.) $T$ has the same eigenvalues as $A$ and is easier to handle.
2. Eigenvalues and eigenvectors of $T$ are computed as required. If all eigenvalues (and optionally eigenvectors) are required, they are computed by the $QR$ algorithm, which effectively factorizes $T$ as $T = S\Lambda S^T$, where $S$ is orthogonal, or by the divide-and-conquer method.
If only selected eigenvalues are required, they are computed by bisection, and if selected eigenvectors are required, they are computed by inverse iteration. If $s$ is an eigenvector of $T$, then $Qs$ is an eigenvector of $A$.
All the above remarks also apply – with the obvious changes – to the case when $A$ is a complex Hermitian matrix. The eigenvectors are complex, but the eigenvalues are all real, and so is the tridiagonal matrix $T$.
If $A$ is large and sparse, the methods just described would be very wasteful in both storage and computing time, and, therefore, an alternative algorithm, known as subspace iteration, is provided (for real problems only) to find a (usually small) subset of the eigenvalues and their corresponding eigenvectors. Chapter F12 contains functions based on the Lanczos method for real symmetric large sparse eigenvalue problems, and these functions are usually more efficient than subspace iteration.
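The bisection approach for selected eigenvalues of a symmetric tridiagonal matrix can be illustrated with a short Python sketch. It counts the eigenvalues below a trial value using the Sturm sequence property and bisects on that count. This is only an illustration of the idea, not the library's implementation; the names `count_below` and `kth_eigenvalue`, the tolerance and the zero-pivot safeguard are assumptions made for the example.

```python
def count_below(d, e, x):
    """Count eigenvalues smaller than x of the symmetric tridiagonal matrix
    with diagonal d (length n) and off-diagonal e (length n - 1)."""
    count = 0
    q = 1.0
    for i in range(len(d)):
        coupling = e[i - 1] ** 2 if i > 0 else 0.0
        q = d[i] - x - coupling / q
        if q == 0.0:
            q = 1e-30  # assumed safeguard against division by zero
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(d, e, k, tol=1e-12):
    """Return the k-th smallest eigenvalue (k = 0 is the smallest) by bisection."""
    n = len(d)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [(abs(e[i - 1]) if i > 0 else 0.0) + (abs(e[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(d[i] - radius[i] for i in range(n))
    hi = max(d[i] + radius[i] for i in range(n))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) > k:
            hi = mid  # at least k + 1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Example: order 5, with 2 on the diagonal and -1 on the off-diagonals.
d = [2.0] * 5
e = [-1.0] * 4
smallest = kth_eigenvalue(d, e, 0)  # the exact value is 2 - sqrt(3)
```

Each eigenvalue is located independently of the others, which is why this style of method suits the case where only a few eigenvalues are wanted.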
2.1.2 Standard nonsymmetric eigenvalue problems
A real nonsymmetric matrix $A$ may have complex eigenvalues, occurring as complex conjugate pairs. If $x$ is an eigenvector corresponding to a complex eigenvalue $\lambda$, then the complex conjugate vector $\bar{x}$ is the eigenvector corresponding to the complex conjugate eigenvalue $\bar{\lambda}$. Note that the vector $x$ defined in equation (1) is sometimes called a right eigenvector; a left eigenvector $y$
is defined by
$$y^H A = \lambda y^H \quad \text{or} \quad A^T y = \bar{\lambda} y.$$
Functions in this chapter only compute right eigenvectors (the usual requirement), but functions in Chapter F08 can compute left or right eigenvectors or both.
The eigenvalue problem can be solved via the Schur factorization of $A$, defined as
$$A = ZTZ^T,$$
where $Z$ is an orthogonal matrix and $T$ is a real upper quasi-triangular matrix, with the same eigenvalues as $A$. $T$ is called the Schur form of $A$. If all the eigenvalues of $A$ are real, then $T$ is upper triangular, and its diagonal elements are the eigenvalues of $A$. If $A$ has complex conjugate pairs of eigenvalues, then $T$ has $2 \times 2$ diagonal blocks, whose eigenvalues are the complex conjugate pairs of eigenvalues of $A$. (The structure of $T$ is simpler if the matrices are complex – see below.)
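For example, the following matrix is in quasi-triangular form
$$\begin{pmatrix} 1 & * & * & * \\ 0 & 2 & -1 & * \\ 0 & 1 & 2 & * \\ 0 & 0 & 0 & 3 \end{pmatrix}$$
and has eigenvalues $1$, $2 \pm i$, and $3$. (The elements indicated by ‘$*$’ may take any values.)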
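To make the role of the diagonal blocks concrete, here is a minimal Python sketch that reads the eigenvalues off a real upper quasi-triangular matrix: a $1 \times 1$ block contributes its diagonal entry, and a $2 \times 2$ block contributes the two roots of its characteristic polynomial. The function name `quasi_triangular_eigenvalues` and the sample matrix (diagonal blocks $1$, $\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}$ and $3$, with arbitrary entries above them) are chosen for illustration only.

```python
import cmath


def quasi_triangular_eigenvalues(T):
    """Eigenvalues of a real upper quasi-triangular matrix given as a list of rows."""
    n = len(T)
    eigenvalues = []
    i = 0
    while i < n:
        if i + 1 < n and T[i + 1][i] != 0.0:
            # 2 x 2 diagonal block [[a, b], [c, d]]
            a, b = T[i][i], T[i][i + 1]
            c, d = T[i + 1][i], T[i + 1][i + 1]
            mean = (a + d) / 2.0
            root = cmath.sqrt(((a - d) / 2.0) ** 2 + b * c)
            eigenvalues.extend([mean + root, mean - root])
            i += 2
        else:
            # 1 x 1 diagonal block
            eigenvalues.append(complex(T[i][i]))
            i += 1
    return eigenvalues


# The entries above the diagonal blocks are arbitrary illustrative values.
T = [
    [1.0, 5.0, -4.0, 7.0],
    [0.0, 2.0, -1.0, 6.0],
    [0.0, 1.0, 2.0, -3.0],
    [0.0, 0.0, 0.0, 3.0],
]
eigs = quasi_triangular_eigenvalues(T)
```

For this sample matrix the result contains the real eigenvalues 1 and 3 and the complex conjugate pair $2 \pm i$, whatever values are placed above the diagonal blocks.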
The columns of $Z$ are called the Schur vectors. For each $k$ $(1 \le k \le n)$, the first $k$ columns of $Z$ form an orthonormal basis for the invariant subspace corresponding to the first $k$ eigenvalues on the diagonal of $T$. (An invariant subspace (for $A$) is a subspace $S$ such that for any vector $v$ in $S$, $Av$ is also in $S$.) Because this basis is orthonormal, it is preferable in many applications to compute Schur vectors rather than eigenvectors. It is possible to order the Schur factorization so that any desired set of $k$ eigenvalues occupy the $k$ leading positions on the diagonal of $T$, and functions for this purpose are provided in Chapter F08.
Note that if $A$ is symmetric, the Schur vectors are the same as the eigenvectors, but if $A$ is nonsymmetric, they are distinct, and the Schur vectors, being orthonormal, are often more satisfactory to work with in numerical computation.
Eigenvalues and eigenvectors of a nonsymmetric matrix may be ill-conditioned, that is, sensitive to perturbations in $A$. Chapter F08 contains functions which compute or estimate the condition numbers of eigenvalues and eigenvectors, and the F08 Chapter Introduction gives more details about the error analysis of nonsymmetric eigenproblems. The accuracy with which eigenvalues and eigenvectors can be obtained is often improved by balancing a matrix. This is discussed further in Section 3.4.
Computation of eigenvalues, eigenvectors or the Schur factorization proceeds in the following stages:
1. $A$ is reduced to an upper Hessenberg matrix $H$ by an orthogonal similarity transformation: $A = QHQ^T$, where $Q$ is orthogonal. (An upper Hessenberg matrix is zero below the first subdiagonal.) $H$ has the same eigenvalues as $A$, and is easier to handle.
2. The upper Hessenberg matrix $H$ is reduced to Schur form $T$ by the $QR$ algorithm, giving the Schur factorization $H = STS^T$. The eigenvalues of $A$ are obtained from the diagonal blocks of $T$. The matrix $Z$ of Schur vectors (if required) is computed as $Z = QS$.
3. After the eigenvalues have been found, eigenvectors may be computed, if required, in two different ways. Eigenvectors of $H$ can be computed by inverse iteration, and then pre-multiplied by $Q$ to give eigenvectors of $A$; this approach is usually preferred if only a few eigenvectors are required. Alternatively, eigenvectors of $T$ can be computed by back-substitution, and pre-multiplied by $Z$ to give eigenvectors of $A$.
All the above remarks also apply – with the obvious changes – to the case when $A$ is a complex matrix. The eigenvalues are in general complex, so there is no need for special treatment of complex conjugate pairs, and the Schur form $T$ is simply a complex upper triangular matrix.
As for the symmetric eigenvalue problem, if $A$ is large and sparse then it is generally preferable to use an alternative method. Chapter F12 provides functions based on Arnoldi's method for both real and complex matrices, intended to find a subset of the eigenvalues and vectors.
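2.2 The Singular Value Decomposition
The singular value decomposition (SVD) of a real $m \times n$ matrix $A$ is given by
$$A = U\Sigma V^T,$$
where $U$ and $V$ are orthogonal and $\Sigma$ is an $m \times n$ diagonal matrix with real diagonal elements, $\sigma_i$, such that
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min(m,n)} \ge 0.$$
The $\sigma_i$ are the singular values of $A$ and the first $\min(m,n)$ columns of $U$ and $V$ are, respectively, the left and right singular vectors of $A$. The singular values and singular vectors satisfy
$$Av_i = \sigma_i u_i \quad \text{and} \quad A^T u_i = \sigma_i v_i,$$
where $u_i$ and $v_i$ are the $i$th columns of $U$ and $V$ respectively.
The singular value decomposition of $A$ is closely related to the eigen-decompositions of the symmetric matrices $A^TA$ or $AA^T$, because:
$$A^TAv_i = \sigma_i^2 v_i \quad \text{and} \quad AA^Tu_i = \sigma_i^2 u_i.$$
However, these relationships are not recommended as a means of computing singular values or vectors unless $A$ is sparse and functions from Chapter F12 are to be used.
If $U_k$, $V_k$ denote the leading $k$ columns of $U$ and $V$ respectively, and if $\Sigma_k$ denotes the leading principal submatrix of $\Sigma$, then
$$A_k \equiv U_k \Sigma_k V_k^T$$
is the best rank-$k$ approximation to $A$ in both the $2$-norm and the Frobenius norm.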
Singular values are well-conditioned; that is, they are not unduly sensitive to perturbations in $A$. The sensitivity of a singular vector depends on how small the gap is between its singular value and any other singular value: the smaller the gap, the more sensitive the singular vector. More details on the accuracy of computed singular values and vectors are given in the function documents and in the F08 Chapter Introduction.
The singular value decomposition is useful for the numerical determination of the rank of a matrix, and for solving linear least squares problems, especially when they are rank-deficient (or nearly so). See Chapter F04.
Computation of singular values and vectors proceeds in the following stages:
1. $A$ is reduced to an upper bidiagonal matrix $B$ by an orthogonal transformation $A = U_1 B V_1^T$, where $U_1$ and $V_1$ are orthogonal. (An upper bidiagonal matrix is zero except for the main diagonal and the first superdiagonal.) $B$ has the same singular values as $A$, and is easier to handle.
2. The SVD of the bidiagonal matrix $B$ is computed as $B = U_2 \Sigma V_2^T$, where $U_2$ and $V_2$ are orthogonal and $\Sigma$ is diagonal as described above. Then in the SVD of $A$, $U = U_1U_2$ and $V = V_1V_2$.
All the above remarks also apply – with the obvious changes – to the case when $A$ is a complex matrix. The singular vectors are complex, but the singular values are real and non-negative, and the bidiagonal matrix $B$ is also real.
By formulating the problems appropriately, real large sparse singular value problems may be solved using the symmetric eigenvalue functions in Chapter F12.
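The relations in this section can be checked numerically. The sketch below uses NumPy (an assumption; it is not part of the NAG Library) on a small illustrative matrix. It computes the SVD, compares the squared singular values with the eigenvalues of $A^TA$, and forms a rank-$k$ truncation to measure the approximation error in the 2-norm and the Frobenius norm.

```python
import numpy as np

# Small illustrative matrix with m = 4 rows and n = 3 columns.
A = np.array([[4.0, 0.0, 1.0],
              [2.0, 3.0, 0.0],
              [0.0, 1.0, 2.0],
              [1.0, 1.0, 1.0]])

# Economy-size SVD: U is 4 x 3, s holds the singular values in
# descending order, and the rows of Vt are the right singular vectors.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigenvalues of the symmetric matrix A^T A, reordered to descending.
eig_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]

# Rank-k truncation built from the leading k singular triplets.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
err_2 = np.linalg.norm(A - A_k, 2)        # equals the (k+1)th singular value
err_fro = np.linalg.norm(A - A_k, "fro")  # root sum of squares of the discarded ones
```

Forming $A^TA$ explicitly squares the condition number, which is one reason the text advises against this route for dense problems.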
2.3 Generalized Eigenvalue Problems
Let $A$ and $B$ be square matrices of order $n$. The generalized eigenvalue problem is to find eigenvalues, $\lambda$, and corresponding eigenvectors, $x \neq 0$, such that
$$Ax = \lambda Bx. \tag{4}$$
For given $A$ and $B$, the set of all matrices of the form $A - \lambda B$ is called a pencil, and $\lambda$ and $x$ are said to be an eigenvalue and eigenvector of the pencil $A - \lambda B$.
When $B$ is nonsingular, equation (4) is mathematically equivalent to $(B^{-1}A)x = \lambda x$, and when $A$ is nonsingular, it is equivalent to $(A^{-1}B)x = (1/\lambda)x$. Thus, in theory, if one of the matrices $A$ or $B$ is known to be nonsingular, the problem could be reduced to a standard eigenvalue problem.
However, for this reduction to be satisfactory from the point of view of numerical stability, it is necessary not only that $B$ (or $A$) should be nonsingular, but that it should be well-conditioned with respect to inversion. The nearer $B$ (or $A$) is to singularity, the more unsatisfactory $B^{-1}A$ (or $A^{-1}B$) will be as a vehicle for determining the required eigenvalues. Well-determined eigenvalues of the original problem (4) may be poorly determined even by the correctly rounded version of $B^{-1}A$ (or $A^{-1}B$).
We consider first a special class of problems in which $B$ is known to be nonsingular, and then return to the general case in the following sub-section.
2.3.1 Generalized symmetric-definite eigenvalue problems
If $A$ and $B$ are symmetric and $B$ is positive definite, then the generalized eigenvalue problem has desirable properties similar to those of the standard symmetric eigenvalue problem. The eigenvalues are all real, and the eigenvectors $z_i$, while not orthogonal in the usual sense, satisfy the relations $z_i^T B z_j = 0$ for $i \neq j$ and can be normalized so that $z_i^T B z_i = 1$.
Note that it is not enough for $A$ and $B$ to be symmetric; $B$ must also be positive definite, which implies nonsingularity. Eigenproblems with these properties are referred to as symmetric-definite problems.
If $\Lambda$ is the diagonal matrix whose diagonal elements are the eigenvalues, and $Z$ is the matrix whose columns are the eigenvectors, then
$$Z^TAZ = \Lambda \quad \text{and} \quad Z^TBZ = I.$$
To compute eigenvalues and eigenvectors, the problem can be reduced to a standard symmetric eigenvalue problem, using the Cholesky factorization of $B$ as $LL^T$ or $U^TU$ (see Chapter F07). Note, however, that this reduction does implicitly involve the inversion of $B$, and hence this approach should not be used if $B$ is ill-conditioned with respect to inversion.
For example, with $B = LL^T$, we have
$$Az = \lambda Bz \quad \Leftrightarrow \quad (L^{-1}AL^{-T})(L^Tz) = \lambda (L^Tz).$$
Hence the eigenvalues of $Az = \lambda Bz$ are those of $Cy = \lambda y$, where $C$ is the symmetric matrix $C = L^{-1}AL^{-T}$ and $y = L^Tz$. The standard symmetric eigenproblem $Cy = \lambda y$ may be solved by the methods described in Section 2.1.1. The eigenvectors $z$ of the original problem may be recovered by computing $z = L^{-T}y$.
Most of the functions which solve this class of problems can also solve the closely related problems
$$ABx = \lambda x \quad \text{or} \quad BAx = \lambda x,$$
where again $A$ and $B$ are symmetric and $B$ is positive definite. See the function documents for details.
All the above remarks also apply – with the obvious changes – to the case when $A$ and $B$ are complex Hermitian matrices. Such problems are called Hermitian-definite. The eigenvectors are complex, but the eigenvalues are all real.
If $A$ and $B$ are large and sparse, reduction to an equivalent standard eigenproblem as described above would almost certainly result in a large dense matrix $C$, and hence would be very wasteful in both storage and computing time. The methods of subspace iteration and Lanczos type methods, mentioned in Section 2.1.1, can also be used for large sparse generalized symmetric-definite problems.
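For dense matrices, the Cholesky-based reduction of a symmetric-definite problem to a standard symmetric one can be sketched in a few lines of Python. The sketch uses NumPy (an assumption; it is not the NAG interface). The function name `symmetric_definite_eig` and the test matrices are illustrative, and general linear solves stand in for the triangular solves a production code would use.

```python
import numpy as np


def symmetric_definite_eig(A, B):
    """Solve A z = lambda B z for symmetric A and symmetric positive definite B."""
    L = np.linalg.cholesky(B)            # B = L L^T with L lower triangular
    X = np.linalg.solve(L, A)            # X = L^{-1} A
    C = np.linalg.solve(L, X.T).T        # C = L^{-1} A L^{-T}
    C = 0.5 * (C + C.T)                  # remove rounding-level asymmetry
    lam, Y = np.linalg.eigh(C)           # standard symmetric problem C y = lambda y
    Z = np.linalg.solve(L.T, Y)          # recover z = L^{-T} y
    return lam, Z


A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[4.0, 1.0], [1.0, 2.0]])
lam, Z = symmetric_definite_eig(A, B)
```

With this construction the recovered eigenvectors are orthonormal with respect to the matrix $B$ rather than in the usual sense, because the eigenvectors of the reduced symmetric problem are orthonormal.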
2.3.2 Generalized nonsymmetric eigenvalue problems
Any generalized eigenproblem which is not symmetric-definite with well-conditioned $B$ must be handled as if it were a general nonsymmetric problem.
If $B$ is singular, the problem has infinite eigenvalues. These are not a problem; they are equivalent to zero eigenvalues of the problem $Bx = \mu Ax$. Computationally they appear as very large values.
If $A$ and $B$ are both singular and have a common null space, then $A - \lambda B$ is singular for all $\lambda$; in other words, any value $\lambda$ can be regarded as an eigenvalue. Pencils with this property are called singular.
As with standard nonsymmetric problems, a real problem may have complex eigenvalues, occurring as complex conjugate pairs.
The generalized eigenvalue problem can be solved via the generalized Schur factorization of $A$ and $B$:
$$A = QSZ^T, \quad B = QTZ^T,$$
where $Q$ and $Z$ are orthogonal, $T$ is upper triangular, and $S$ is upper quasi-triangular (defined just as in Section 2.1.2).
If all the eigenvalues are real, then $S$ is upper triangular; the eigenvalues are given by $\lambda_i = s_{ii}/t_{ii}$. If there are complex conjugate pairs of eigenvalues, then $S$ has $2 \times 2$ diagonal blocks.
Eigenvalues and eigenvectors of a generalized nonsymmetric problem may be ill-conditioned; that is, sensitive to perturbations in $A$ or $B$.
Particular care must be taken if, for some $i$, $s_{ii} = t_{ii} = 0$, or in practical terms if $s_{ii}$ and $t_{ii}$ are both small; this means that the pencil is singular, or approximately so. Not only is the particular value $\lambda_i$ undetermined, but also no reliance can be placed on any of the computed eigenvalues. See also the function documents.
Computation of eigenvalues and eigenvectors proceeds in the following stages.
1. The pencil $A - \lambda B$ is reduced by an orthogonal transformation to a pencil $H - \lambda R$ in which $H$ is upper Hessenberg and $R$ is upper triangular: $A = Q_1HZ_1^T$ and $B = Q_1RZ_1^T$. The pencil $H - \lambda R$ has the same eigenvalues as $A - \lambda B$, and is easier to handle.
2. The upper Hessenberg matrix $H$ is reduced to upper quasi-triangular form, while $R$ is maintained in upper triangular form, using the $QZ$ algorithm. This gives the generalized Schur factorization: $H = Q_2SZ_2^T$ and $R = Q_2TZ_2^T$.
3. Eigenvectors of the pencil $S - \lambda T$ are computed (if required) by back-substitution, and pre-multiplied by $Z_1Z_2$ to give eigenvectors of the pencil $A - \lambda B$.
All the above remarks also apply – with the obvious changes – to the case when $A$ and $B$ are complex matrices. The eigenvalues are in general complex, so there is no need for special treatment of complex conjugate pairs, and the matrix $S$ in the generalized Schur factorization is simply a complex upper triangular matrix.
As for the generalized symmetric-definite eigenvalue problem, if $A$ and $B$ are large and sparse then it is generally preferable to use an alternative method. Chapter F12 provides functions based on Arnoldi's method for both real and complex matrices, intended to find a subset of the eigenvalues and vectors.
2.4 Quadratic Eigenvalue Problems
Let $A$, $B$ and $C$ be square matrices of order $n$. The quadratic eigenvalue problem (QEP) is to find eigenvalues, $\lambda$, and corresponding eigenvectors, $x \neq 0$, such that
$$(\lambda^2 A + \lambda B + C)x = 0.$$
More specifically, $x$ is a right eigenvector; a left eigenvector, $y \neq 0$, is such that
$$y^H(\lambda^2 A + \lambda B + C) = 0,$$
where $y^H$ is the conjugate transpose of $y$ (transpose when $y$ is real).
In general the QEP has $2n$ eigenvalues and corresponding eigenvectors.
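QEPs are generally solved by linearizing the problem to produce a generalized eigenvalue problem. For example,
$$\begin{pmatrix} 0 & I \\ -C & -B \end{pmatrix} \begin{pmatrix} x \\ \lambda x \end{pmatrix} = \lambda \begin{pmatrix} I & 0 \\ 0 & A \end{pmatrix} \begin{pmatrix} x \\ \lambda x \end{pmatrix},$$
which is called the first companion form and has the same eigenvalues as the QEP.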
If
$$\det(\lambda^2 A + \lambda B + C) \not\equiv 0,$$
then the QEP is said to be regular, or non-singular. For a regular QEP, when $C$ is singular the QEP has one or more zero eigenvalues and when $A$ is singular the QEP has one or more infinite eigenvalues. As with the generalized problem, particular care must be taken when the problem is singular (see Section 2.3.2).
As with generalized nonsymmetric problems, a real QEP may have complex eigenvalues, occurring as complex conjugate pairs.
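The linearization idea can be sketched in Python with NumPy (an assumption; this is not the NAG interface). The sketch builds a companion-type pencil $X - \lambda Y$ of order $2n$ with $X = \begin{pmatrix} 0 & I \\ -C & -B \end{pmatrix}$ and $Y = \begin{pmatrix} I & 0 \\ 0 & A \end{pmatrix}$, whose eigenvectors have the form $(x, \lambda x)$. For simplicity it assumes that $A$ is nonsingular and well-conditioned, so that the pencil can be converted to a standard problem; a robust solver would apply the $QZ$ algorithm to the pencil instead. The name `quadratic_eig` and the test matrices are illustrative.

```python
import numpy as np


def quadratic_eig(A, B, C):
    """Eigenvalues and right eigenvectors of (lambda^2 A + lambda B + C) x = 0.

    Assumes A is nonsingular and well-conditioned (illustrative shortcut)."""
    n = A.shape[0]
    eye = np.eye(n)
    zero = np.zeros((n, n))
    X = np.block([[zero, eye], [-C, -B]])
    Y = np.block([[eye, zero], [zero, A]])
    # X z = lambda Y z with z = (x, lambda x); reduce to Y^{-1} X z = lambda z.
    lam, V = np.linalg.eig(np.linalg.solve(Y, X))
    return lam, V[:n, :]   # the top half of each z is the eigenvector x


A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[0.4, 0.1], [0.1, 0.3]])
C = np.array([[3.0, -1.0], [-1.0, 2.0]])
lam, Xvec = quadratic_eig(A, B, C)

# Residual norms of the original quadratic problem for each eigenpair.
residuals = [np.linalg.norm((l * l * A + l * B + C) @ Xvec[:, i])
             for i, l in enumerate(lam)]
```

The problem of order 2 yields four eigenvalues, and the residuals computed against the original quadratic confirm that the linearized pencil reproduces its eigenpairs.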
3 Recommendations on Choice and Use of Available Functions
3.1 Black Box Functions and General Purpose Functions
Functions in the NAG Library for solving eigenvalue problems fall into two categories.
1. Black Box Functions: these are designed to solve a standard type of problem in a single call – for example, to compute all the eigenvalues and eigenvectors of a real symmetric matrix. You are recommended to use a Black Box function if there is one to meet your needs; refer to the decision tree in Section 4.1 or the index in Section 5.
2. General Purpose Functions: these perform the computational subtasks which make up the separate stages of the overall task, as described in Section 2 – for example, reducing a real symmetric matrix to tridiagonal form. For historical reasons, general purpose functions are spread across chapters: some are in this chapter, a few in Chapter F01, but most in Chapter F08. If there is no Black Box function that meets your needs, you will need to use one or more general purpose functions.
Here are some of the more likely reasons why you may need to do this:
Your problem is already in one of the reduced forms – for example, your symmetric matrix is already tridiagonal.
You wish to economize on storage for symmetric matrices (see Section 3.3).
You wish to find selected eigenvalues or eigenvectors of a generalized symmetric-definite eigenproblem (see also Section 3.2).
The decision trees in Section 4.2 list the combinations of general purpose functions which are needed to solve many common types of problem.
Sometimes a combination of a Black Box function and one or more general purpose functions will be the most convenient way to solve your problem: the Black Box function can be used to compute most of the results, and a general purpose function can be used to perform a subsidiary computation, such as computing condition numbers of eigenvalues and eigenvectors.
3.2 Computing Selected Eigenvalues and Eigenvectors
The decision trees and the function documents make a distinction between functions which compute all eigenvalues or eigenvectors, and functions which compute selected eigenvalues or eigenvectors; the two classes of function use different algorithms.
It is difficult to give clear guidance on which of these two classes of function to use in a particular case, especially with regard to computing eigenvectors. If you only wish to compute a very few eigenvectors, then a function for selected eigenvectors will be more economical, but if you want to compute a substantial subset (an old rule of thumb suggested more than 25%), then it may be more economical to compute all of them. Conversely, if you wish to compute all the eigenvectors of a sufficiently large symmetric tridiagonal matrix, the function for selected eigenvectors may be faster.
The choice depends on the properties of the matrix and on the computing environment; if it is critical, you should perform your own timing tests.
For dense nonsymmetric eigenproblems, there are no algorithms provided for computing selected eigenvalues; it is always necessary to compute all the eigenvalues, but you can then select specific eigenvectors for computation by inverse iteration.
3.3 Storage Schemes for Symmetric Matrices
Functions which handle symmetric matrices are usually designed to use either the upper or lower triangle of the matrix; it is not necessary to store the whole matrix. If either the upper or lower triangle is stored conventionally in the upper or lower triangle of a two-dimensional array, the remaining elements of the array can be used to store other useful data. However, that is not always convenient, and if it is important to economize on storage, the upper or lower triangle can be stored in a one-dimensional array of length $n(n+1)/2$; in other words, the storage is almost halved. This storage format is referred to as packed storage.
Functions designed for packed storage are usually less efficient, especially on high-performance computers, so there is a trade-off between storage and efficiency.
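As an illustration of packed storage, the following Python sketch packs the upper triangle of a symmetric matrix column by column into a one-dimensional list and recovers any element from it. This column-wise layout with 0-based indices is one common convention and is an assumption here; the function documents define the layout each function actually expects. The helper names are hypothetical.

```python
def pack_upper(A):
    """Pack the upper triangle of a square matrix (list of rows) by columns."""
    n = len(A)
    return [A[i][j] for j in range(n) for i in range(j + 1)]


def packed_index(i, j):
    """Position of element (i, j) of a symmetric matrix in the packed array."""
    if i > j:
        i, j = j, i  # use symmetry to map the lower triangle to the upper
    return i + j * (j + 1) // 2


def packed_get(ap, i, j):
    """Read element (i, j) from packed storage."""
    return ap[packed_index(i, j)]


A = [
    [4.0, 1.0, 2.0],
    [1.0, 5.0, 3.0],
    [2.0, 3.0, 6.0],
]
ap = pack_upper(A)  # length n(n+1)/2 = 6 instead of n*n = 9
```

The index arithmetic needed for every element access is one source of the efficiency penalty associated with packed formats.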
A band matrix is one whose nonzero elements are confined to a relatively small number of subdiagonals or superdiagonals on either side of the main diagonal. Algorithms can take advantage of bandedness to reduce the amount of work and storage required.
Functions which take advantage of packed storage or bandedness are provided for both standard symmetric eigenproblems and generalized symmetric-definite eigenproblems.
3.4 Balancing for Nonsymmetric Eigenproblems
There are two preprocessing steps which one may perform on a nonsymmetric matrix in order to make its eigenproblem easier. Together they are referred to as balancing.
1. Permutation: this involves reordering the rows and columns to make $A$ more nearly upper triangular (and thus closer to Schur form): $A' = PAP^T$, where $P$ is a permutation matrix. If $A$ has a significant number of zero elements, this preliminary permutation can reduce the amount of work required, and also improve the accuracy of the computed eigenvalues. In the extreme case, if $A$ is permutable to upper triangular form, then no floating-point operations are needed to reduce it to Schur form.
2. Scaling: a diagonal matrix $D$ is used to make the rows and columns of $A'$ more nearly equal in norm: $A'' = DA'D^{-1}$. Scaling can make the matrix norm smaller with respect to the eigenvalues, and so possibly reduce the inaccuracy contributed by roundoff (see Chapter II/11 of Wilkinson and Reinsch (1971)).
Functions are provided in Chapter F08 for performing either or both of these preprocessing steps, and also for transforming computed eigenvectors or Schur vectors back to those of the original matrix.
Black Box functions in this chapter which compute the Schur factorization perform only the permutation step, since diagonal scaling is not in general an orthogonal transformation. The Black Box functions which compute eigenvectors perform both forms of balancing.
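The scaling step can be illustrated with a simplified Python sketch using NumPy (an assumption; this is not the algorithm used by the library functions). For each index it compares the off-diagonal 1-norms of the corresponding row and column and applies a diagonal similarity with a power-of-two factor, so that the scaling itself introduces no rounding error. The name `scale_to_balance`, the sweep limit and the test matrix are illustrative.

```python
import numpy as np


def scale_to_balance(A, max_sweeps=50):
    """Return (A_bal, d) with A_bal = D A D^{-1}, where D = diag(d)."""
    A_bal = np.array(A, dtype=float)
    n = A_bal.shape[0]
    d = np.ones(n)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            col = np.sum(np.abs(A_bal[:, i])) - abs(A_bal[i, i])
            row = np.sum(np.abs(A_bal[i, :])) - abs(A_bal[i, i])
            if col == 0.0 or row == 0.0:
                continue
            # Power of two closest (in exponent) to sqrt(col / row).
            f = 2.0 ** np.rint(0.5 * np.log2(col / row))
            if f != 1.0:
                A_bal[i, :] *= f   # row i is multiplied by d_i
                A_bal[:, i] /= f   # column i is divided by d_i
                d[i] *= f
                changed = True
        if not changed:
            break
    return A_bal, d


# Badly scaled illustrative matrix.
A = np.array([[1.0, 1e4, 1e8],
              [1e-4, 2.0, 1e4],
              [1e-8, 1e-4, 3.0]])
A_bal, d = scale_to_balance(A)
norm_before = np.linalg.norm(A, "fro")
norm_after = np.linalg.norm(A_bal, "fro")
```

Because the transformation is a similarity, the eigenvalues are unchanged, while the norm of the matrix drops by many orders of magnitude in this example.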
3.5 Non-uniqueness of Eigenvectors and Singular Vectors
Eigenvectors, as defined by equations (1) or (4), are not uniquely defined. If $x$ is an eigenvector, then so is $kx$ where $k$ is any nonzero scalar. Eigenvectors computed by different algorithms, or on different computers, may appear to disagree completely, though in fact they differ only by a scalar factor (which may be complex). These differences should not be significant in any application in which the eigenvectors will be used, but they can arouse uncertainty about the correctness of computed results.
Even if eigenvectors are normalized so that $\|x\|_2 = 1$, this is not sufficient to fix them uniquely, since they can still be multiplied by a scalar factor $k$ such that $|k| = 1$. To counteract this inconvenience, most of the functions in this chapter, and in Chapter F08, normalize eigenvectors (and Schur vectors) so that
$$\|x\|_2 = 1$$
and the component of $x$ with largest absolute value is real and positive. (There is still a possible indeterminacy if there are two components of equal largest absolute value – or in practice if they are very close – but this is rarely a problem.)
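The normalization just described can be sketched in plain Python as follows. The function name `normalize_eigenvector` is hypothetical, and ties between components of equal largest absolute value are resolved by taking the first one, which is an assumption of this sketch.

```python
import math


def normalize_eigenvector(x):
    """Scale a nonzero vector to unit 2-norm with its largest component real and positive."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in x))
    pivot = max(x, key=abs)        # first component of largest absolute value
    phase = pivot / abs(pivot)     # scalar of modulus 1
    return [c / (norm * phase) for c in x]


x = [1 + 1j, -2j, 0.5]
y = [3j * c for c in x]            # the same eigenvector, scaled by a complex factor
nx = normalize_eigenvector(x)
ny = normalize_eigenvector(y)
```

Here `nx` and `ny` agree to rounding error even though `x` and `y` look quite different, which shows how such a convention makes results from different sources comparable.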
In symmetric problems the computed eigenvalues are sorted into ascending order, but in nonsymmetric problems the order in which the computed eigenvalues are returned is dependent on the detailed working of the algorithm and may be sensitive to rounding errors. The Schur form and Schur vectors depend on the ordering of the eigenvalues and this is another possible cause of non-uniqueness when they are computed. However, it must be stressed again that variations in the results from this cause should not be significant. (Functions in Chapter F08 can be used to transform the Schur form and Schur vectors so that the eigenvalues appear in any given order if this is important.)
In singular value problems, the left and right singular vectors $u_i$ and $v_i$ which correspond to a singular value $\sigma_i$ cannot be normalized independently: if $u_i$ is multiplied by a factor $k$ such that $|k| = 1$, then $v_i$ must also be multiplied by $k$.
Non-uniqueness also occurs among eigenvectors which correspond to a multiple eigenvalue, or among singular vectors which correspond to a multiple singular value. In practice, this is more likely to be apparent as the extreme sensitivity of eigenvectors which correspond to a cluster of close eigenvalues (or of singular vectors which correspond to a cluster of close singular values).
4 Decision Trees
4.1 Black Box Functions
The decision tree for this section is divided into three sub-trees.
Note: for the Chapter F08 functions there is generally a choice of simple and comprehensive function. The comprehensive functions return additional information such as condition and/or error estimates.
Tree 1: Eigenvalues and Eigenvectors of Real Matrices
As it is very unlikely that one of the functions in this section will be called on its own, the other functions required to solve a given problem are listed in the order in which they should be called.
4.2.2 Singular Value Decomposition
See Section 4.1.2 in the F08 Chapter Introduction. For real sparse matrices where only selected singular values are required (possibly with their singular vectors), functions from Chapter F12 may be applied to the symmetric matrix $A^TA$; see Section 10 in f12fbc.
Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J J, Du Croz J J, Greenbaum A, Hammarling S, McKenney A and Sorensen D (1999) LAPACK Users' Guide (3rd Edition) SIAM, Philadelphia
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S, Munro C J and Tisseur F (2013) An algorithm for the complete solution of quadratic eigenvalue problems ACM Trans. Math. Software 39(3):18:1–18:119 http://eprints.maths.manchester.ac.uk/2061/
Parlett B N (1998) The Symmetric Eigenvalue Problem SIAM, Philadelphia
Tisseur F and Meerbergen K (2001) The quadratic eigenvalue problem SIAM Review 43:235–286
Wilkinson J H and Reinsch C (1971) Handbook for Automatic Computation II, Linear Algebra Springer–Verlag