NAG Library Chapter Introduction
F06 (blas)
Linear Algebra Support Routines
1
Scope of the Chapter
This chapter is concerned with basic linear algebra routines which perform elementary algebraic operations involving scalars, vectors and matrices. It includes routines which conform to the specifications of the BLAS (Basic Linear Algebra Subprograms).
2
Background to the Problems
A number of the routines in this chapter meet the specification of the
Basic Linear Algebra Subprograms (BLAS) as described in
Lawson et al. (1979),
Dodson et al. (1991),
Dongarra et al. (1988) and
Dongarra et al. (1990). The first reference describes a set of routines concerned with operations on scalars and vectors: these will be referred to here as the Level-0 and the Level-1
BLAS; the second reference describes a set of routines concerned with operations on sparse vectors: these will be referred to here as the Level-1
Sparse BLAS; the third reference describes a set of routines concerned with matrix-vector operations: these will be referred to here as the Level-2 BLAS;
and the fourth reference describes a set of routines concerned with matrix-matrix operations: these will be referred to here as the Level-3 BLAS.
More generally we refer to the scalar routines in the chapter as
Level-0 routines, to the vector routines as Level-1 routines, to the matrix-vector and matrix routines as Level-2 routines, and to the matrix-matrix routines as Level-3 routines. The terminology reflects the number of operations involved. For example, a Level-2 routine
involves operations for an matrix.
2.1
The Use of BLAS Names
Many of the routines in other chapters of the Library call the routines in this chapter, and in particular a number of the BLAS are called. These routines are usually called by the BLAS name and so, for correct operation of the Library, it is essential that you do not attempt to link your own versions of these routines. If you are in any doubt about how to avoid this, please consult your computer centre or the NAG
Response Centre.
The BLAS names are used in order to make use of efficient implementations of the routines when these exist. Such implementations are stringently tested before being used, to ensure that they correctly meet the specification of the BLAS, and that they return the desired accuracy (see, for example,
Dodson et al. (1991),
Dongarra et al. (1988) and
Dongarra et al. (1990)).
2.2
Background Information
Most of the routines in this chapter implement straightforward scalar,
vector and matrix operations that need no further explanation beyond a statement of the purpose of the routine. In this section we give some additional background information to those few cases where additional explanation may be necessary. A sub-section is devoted to each topic.
2.2.1
Real plane rotations
There are a number of routines in the chapter concerned with setting up and applying plane rotations. This section discusses the real case and the next section looks at the complex case. For further background information see
Golub and Van Loan (1996).
A plane rotation matrix for the
plane,
,
is an orthogonal matrix that is different from the unit matrix only in the elements
,
,
and
. If we put
then, in the real case, it is usual to choose
so that
An exception is routine
f06fpf which applies the so-called symmetric rotation for which
The application of plane rotations is straightforward and needs no further elaboration, so further comment is made only on the construction of plane rotations.
The most common use of plane rotations is to choose
and
so that for given
and
,
In such an application the matrix
is often termed a
Givens rotation matrix. There are two approaches to the construction of real Givens rotations in
Chapter F06.
The BLAS routine
f06aaf (drotg),
see
Lawson et al. (1979) and
Dodson and Grimes (1982),
computes
,
and
as
where
The value
defined as
is also computed and this enables
and
to be reconstructed from the single value
as
The other
Chapter F06 routines for constructing Givens rotations are based on the computation of the tangent,
.
is computed as
where
and
is the small positive value returned by
x02amf. The values of
and
are then computed or reconstructed via
as
where
is the
machine precision. Note that
is always non-negative in this scheme and that the same expressions are used in the initial computation of
and
from
and
as in any subsequent recovery of
and
via
. This is the approach used by many of the NAG Library routines that require plane rotations.
is computed simply as
You need not be too concerned with the above detail, since routines are provided for setting up, recovering and applying such rotations.
Another use of plane rotations is to choose
and
so that for given
,
and
In such an application the matrix
is often termed a
Jacobi rotation matrix. The routine that generates a Jacobi rotation (
f06bef) first computes the tangent
and then computes
and
via
as described above for the Givens rotation.
2.2.2
Complex plane rotations
In the complex case a plane rotation matrix for the
plane,
is a unitary matrix and, analogously to the real case,
it is usual to choose
so that
where
denotes the complex conjugate of
.
The BLAS (see
Lawson et al. (1979)) do not contain a routine for the generation of complex rotations, and so the routines in
Chapter F06
are all based upon computing
and
via
in an analogous manner to the real case.
can be chosen to have either
real,
or
real and there are routines for both cases.
When
is real then it is non-negative and the transformation
is such that if
is real then
is also real.
When
is real then the transformation
is such that if
is real then
is also real.
2.2.3
Elementary real (Householder) reflections
There are a number of routines in the chapter concerned with setting up and applying Householder transformations. This section discusses the real case and the next section looks at the complex case. For further background information see
Golub and Van Loan (1996).
A real elementary reflector,
, is a matrix of the form
where
is a scalar and
is a vector,
and
is both symmetric and orthogonal. In the routines in
Chapter F06,
is expressed in the form
because in many applications
and
are not contiguous elements. The usual use of elementary reflectors is to choose
and
so that for given
and
Such a transformation is often termed a
Householder transformation. There are two choices of
and
available in
Chapter F06.
The first form of the Householder transformation is compatible with that used by LINPACK (see
Dongarra et al. (1979)) and has
This choice makes
satisfy
The second form, and the form used by many of the NAG Library
routines, has
which makes
In both cases the special setting
is used by the routines to flag the case where
.
Note that while there are routines to apply an elementary reflector to a vector, there are no routines available in
Chapter F06 to apply an elementary reflector to a matrix. This is because such transformations can readily and efficiently be achieved by calls to the matrix-vector
Level 2 BLAS routines. For example, to form
for a given matrix
and so we can call a matrix-vector product routine to form
and then call a rank-one update routine to form
. Of course, we must skip the transformation when
has been set to zero.
2.2.4
Elementary complex (Householder) reflections
A complex elementary reflector,
, is a matrix of the form
where
denotes the complex conjugate of
, and
is both Hermitian and unitary. For convenience in a number of applications this definition can be generalized slightly by allowing
to be complex and so defining the generalized elementary reflector as
for which
is still unitary, but is no longer Hermitian.
The
Chapter F06 routines choose
and
so that
and this reduces to
(12) with the choice
(16) when
and
are real. This choice is used because
and
can now be chosen so that in the Householder transformation
(14) we can make
and, as in the real case,
Rather than returning
and
as separate arguments the
Chapter F06 routines return the single complex value
defined as
Obviously
and
can be recovered as
The special setting
is used to flag the case where
, and
is used to flag the case where
and in this case
actually contains the value of
. Notice that with both
(18) and
(21) we merely have to supply
rather than
in order to represent
.
3
Recommendations on Choice and Use of Available Routines
3.1
Naming Scheme
3.1.1
NAG names
Table 1 shows the naming scheme for the routines in this chapter.
|
|
Level-0 |
Level-1 |
Level-2 |
Level-3 |
integer |
Chapter F06 routine |
– |
F06D_F |
– |
– |
‘real’ |
BLAS routine |
F06A_F |
F06E_F |
F06P_F |
F06Y_F |
‘real’ |
Chapter F06 routine |
F06B_F |
F06F_F |
F06Q_F |
– |
|
|
|
|
F06R_F |
|
‘complex’ |
BLAS routine |
– |
F06G_F |
F06S_F |
F06Z_F |
‘complex’ |
Chapter F06 routine |
F06C_F |
F06H_F |
F06T_F |
– |
|
|
|
|
F06U_F |
|
‘mixed type’ |
BLAS routine |
– |
F06J_F |
– |
– |
‘mixed type’ |
Chapter F06 routine |
– |
F06K_F |
F06V_F |
– |
‘real’ and ‘complex’ |
LAPACK routines |
– |
– |
F06W_F |
F06W_F |
Table 1
The heading ‘mixed type’ is for routines where a mixture of data types is involved, such as a routine that returns the real Euclidean length of a complex vector. In future marks of the Library, routines may be included in categories that are currently empty and further categories may be introduced.
3.1.2
BLAS names
Those routines which conform to the specifications of the BLAS may be called either by their NAG names or by their BLAS names.
In many implementations of the NAG Library, references to BLAS names may be linked to an efficient machine-specific implementation of the BLAS, usually provided by the vendor of the machine. Such implementations are stringently tested before being used with the NAG Library, to ensure that they correctly meet the specifications of the BLAS, and that they return the desired accuracy. Use of BLAS names is recommended for efficiency.
References to NAG routine names (beginning F06-) are always linked to the code provided in the NAG Library and may be significantly slower than the equivalent BLAS routine.
The names of the Level-2 and Level-3 BLAS follow a simple scheme (which is similar to that used for LAPACK routines in
Chapters F07 and
F08). Each name has the structure
XYYZZZ, where the components have the following meanings:
– |
the initial letter X indicates the data type (real or complex) and precision:
S |
real, single precision (in Fortran, REAL) |
D |
real, double precision (in Fortran, DOUBLE PRECISION) |
C |
complex, single precision (in Fortran, COMPLEX) |
Z |
complex, double precision (in Fortran, COMPLEX*16 or DOUBLE COMPLEX) |
|
– |
the second and third letters YY indicate the type of the matrix (and in some cases its storage scheme):
GE |
general |
GB |
general band |
SY |
symmetric |
SP |
symmetric (packed storage) |
SB |
symmetric band |
HE |
(complex) Hermitian |
HP |
(complex) Hermitian (packed storage) |
HB |
(complex) Hermitian band |
TR |
triangular |
TP |
triangular (packed storage) |
TB |
triangular band |
|
– |
the remaining , or letters ZZZ indicate the computation performed:
MV |
matrix-vector product |
MM |
matrix-matrix product |
R |
rank-1 update |
R2 |
rank-2 update |
RK |
rank- update |
R2K |
rank- update |
SV |
solve a system of linear equations |
SM |
solve a system of linear equations with a matrix of right-hand sides |
|
Thus the routine
dgemv performs a
matrix-vector product involving a real general matrix in double precision;
the corresponding routine for a complex general matrix is
zgemv.
The names of the Level-1 BLAS mostly follow the same convention for the initial letter (S-, C-, D- or Z-), except for a few involving data of mixed type, where the first two characters are precision-dependent.
3.1.3
LAPACK names
There are some LAPACK routines in this chapter that have BLAS-like functionalty. Four are equivalent to BLAS routines but for matrices stored in Rectangular Full Packed (RFP) format. The naming convention for these is as above with the addition of the matrix types:
HF |
(complex) Hermitian (RFP storage) |
TF |
triangular (RFP storage) |
SF |
symmetric (RFP storage) |
There are an additonal two that compute norms of RFP matrices. These have second and third letters LA (signifying LAPACK), fourth letter N (signifying norm), and fifth and sixth letter signifying matrix type as above. For example
zlanhf computes the norm of a Hermitian matrix in RFP format.
3.2
The Level-0 Scalar Routines
The Level-0 routines perform operations on scalars or on vectors or matrices of order .
3.3
The Level-1 Vector Routines
The Level-1 routines perform operations either on a single vector or on a pair of vectors.
3.4
The Level-2 Matrix-vector and Matrix Routines
The Level-2 routines perform operations involving either a matrix on its own, or a matrix and one or more vectors.
3.5
The Level-3 Matrix-matrix Routines
The Level-3 routines perform operations involving matrix-matrix products.
3.6
Vector Arguments
Vector arguments (except in the Level-1 Sparse BLAS) are represented by a one-dimensional array, immediately followed by an increment argument whose name consists of the three characters INC followed by the name of the array. For example, a vector is represented by the two arguments x and incx. The length of the vector, say, is passed as a separate argument,
n.
The increment argument is the spacing (stride) in the array between the elements of the vector. For instance, if ,
then the elements of are in locations of the array
x
and the intermediate locations are not referenced.
When , the vector element is in the array element . When , the elements are stored in the reverse order so that the vector element is in the array element and hence, in particular, the element is in . The declared length of the array x in the calling subroutine must be at least .
Negative increments are permitted only for:
- Level-1 routines which have more than one vector argument;
- Level-2 BLAS routines (but not for other Level-2 routines)
Zero increments are formally permitted for Level-1 routines with more than one argument (in which case the element is accessed repeatedly), but their use is strongly discouraged since the effect may be implementation-dependent. There is usually an alternative routine in this chapter, with a simplified argument list, to achieve the required purpose. Zero increments are not permitted in the Level-2 BLAS.
In the Level-1 Sparse BLAS, each routine operates on two vectors
and
. The vector
is stored as a compressed sparse vector, and is represented by the three arguments
nz,
x and
indx;
nz is the number of ‘interesting’ (usually nonzero) elements of
, and
indx is a one-dimensional
index array such that
The (mathematical) length of the vector,
say, does not need to be supplied; it is assumed that
. For example, the vector
could be represented with
,
,
. The second vector
is stored conventionally, and is represented simply by the one-dimensional array
y, with
in
; the increment is assumed to be
. Only the elements
are referenced.
Non-positive values of
nz are permitted, in which case the routines return immediately — except that functions set their value to zero before returning. For those routines where
y is an output argument the values in the array indx must be distinct; violating this condition may yield incorrect results.
3.7
Matrix Arguments and Storage Schemes
In this chapter the following different storage schemes are used for matrices:
- – conventional storage in a two-dimensional array;
- – packed and RFP storage for symmetric, Hermitian or triangular matrices;
- – band storage for band matrices;
- – storage for spiked matrices.
These storage schemes are compatible with those used in
Chapters F07 and
F08. (Different schemes for packed or band storage are used in a few older routines in
Chapters F01,
F02,
F03 and
F04.)
Chapter F01 provides some utility routines for conversion between storage schemes.
In the examples, indicates an array element which need not be set and is not referenced by the routines. The examples illustrate only the relevant leading rows and columns of the arrays; array arguments may of course have additional rows or columns, according to the usual rules for passing array arguments in Fortran.
3.7.1
Conventional storage
Please see
Section 3.3.1 in the F07 Chapter Introduction for full details.
3.7.2
Packed storage
Please see
Section 3.3.2 in the F07 Chapter Introduction for full details.
3.7.3
Rectangular Full Packed (RFP) storage
Please see
Section 3.3.3 in the F07 Chapter Introduction for full details.
3.7.4
Band storage
Please see
Section 3.3.4 in the F07 Chapter Introduction for full details.
3.7.5
Unit triangular matrices
Please see
Section 3.3.5 in the F07 Chapter Introduction for full details.
3.7.6
Real diagonal elements of complex Hermitian matrices
Please see
Section 3.3.6 in the F07 Chapter Introduction for full details.
3.7.7
Spiked matrices
A few routines in this chapter
(
f06qsf,
f06qwf,
f06tsf and
f06twf)
deal with
upper spiked matrices. These are upper triangular matrices with an additional nonzero row or column below the diagonal.
The position of the spike is defined by indices
and
; it is assumed that
. A
row spike has nonzero elements in the
th row,
for
; a
column spike has nonzero elements in the
th column,
for
. For example, when
,
and
:
Row spike |
Column spike |
|
|
The storage scheme adopted by the routines in this chapter is for the upper triangular part of the spiked matrix to be stored conventionally in a two-dimensional array a, with the subdiagonal elements of the spike stored in a separate vector.
3.8
Option Parameters
Many of the routines in this chapter have one or more option arguments, of type CHARACTER. The descriptions in the routine documents refer only to upper-case values (for example
or
); however, in every case, the corresponding lower-case characters may be supplied (with the same meaning). Any other value is illegal.
A longer character string can be passed as the actual argument, making the calling program more readable, but only the first character is significant. (This is a feature of Fortran.) For example:
Call dtrsv('Upper','Transpose','Non-unit',...)
The following option arguments are used in this chapter:
- If , operate with the matrix (Not transposed);
- if , operate with the Transpose of the matrix;
- if , operate with the Conjugate transpose of the matrix.
- If , upper triangle or trapezoid of matrix;
- if , lower triangle or trapezoid of matrix.
- If , unit triangular;
- if , nonunit triangular.
- If , operate from the left-hand side;
- if , operate from the right-hand side.
- If , variable pivot (in applying a sequence of plane rotations);
- if , bottom pivot;
- if , top pivot;
- if , fixed pivot.
- If , backward sequence of plane rotations;
- if , forward sequence of plane rotations.
- If or , -norm of a matrix;
- if , -norm of a matrix;
- if or , Frobenius or Euclidean norm of a matrix;
- if , maximum absolute value of the elements of a matrix (not strictly a norm).
- If , general (rectangular or square) matrix;
- if , upper trapezoidal or triangular matrix;
- if , lower trapezoidal or triangular matrix.
- if , matrix stored in normal RFP format (Not transposed).
- if , transpose of the matrix stored in RFP format.
- if , conjugate transpose of the matrix stored in RFP format.
3.8.1
Matrix norms
The option argument
norm
specifies different matrix norms whose definitions are given here for reference (for a general
by
matrix
):
- One-norm ( or ):
- Infinity-norm ():
- Frobenius or Euclidean norm ( or ):
If is symmetric or Hermitian, .
The argument
norm
can also be used to specify the maximum absolute value (if
),
but this is not a norm in the strict mathematical sense.
3.9
Error Handling
Routines in this chapter do not use the usual NAG Library error-handling mechanism, involving the argument IFAIL.
If one of the Level-2 or Level-3 BLAS routines is called with an invalid value of one of its arguments, then an error message is output on the error message unit (see
x04aaf), giving the name of the routine and the number of the first invalid argument, and execution of the program is terminated. The following values of arguments are invalid:
- – any value of the character arguments
trans,
transa,
transb,
uplo,
side or
diag, whose meaning is not specified;
- – a negative value of any of the arguments
m,
n,
k,
kl or
ku;
- – too small a value for any of the leading dimension arguments;
- – a zero value for the increment arguments
incx and
incy.
Zero values for the matrix dimensions
m,
n or
k are considered valid.
The other routines in this chapter do not report any errors in their arguments. Normally, if called, for example, with an unspecified value for one of the option arguments, or with a negative value of one of the problem dimensions
m or
n, they simply do nothing and return immediately.
4
Functionality Index
Level 0 (Scalar) operations, | | |
apply similarity rotation to 2 by 2 Hermitian matrix | | f06chf |
generate a plane rotation, storing the tangent, real cosine | | f06caf |
generate a plane rotation, storing the tangent, real sine | | f06cbf |
quotient of two numbers, with overflow flag | | f06clf |
recover cosine and sine from given tangent, real cosine | | f06ccf |
recover cosine and sine from given tangent, real sine | | f06cdf |
apply similarity rotation to 2 by 2 symmetric matrix | | f06bhf |
compute Euclidean norm from scaled form | | f06bmf |
eigenvalue of 2 by 2 symmetric matrix | | f06bpf |
generate a Jacobi plane rotation | | f06bef |
generate a plane rotation storing the tangent | | f06baf |
quotient of two numbers, with overflow flag | | f06blf |
recover cosine and sine from given tangent | | f06bcf |
Level 1 (Vector) operations, | | |
apply a complex plane rotation | | f06hpf |
apply an elementary reflection to a vector | | f06htf |
broadcast a scalar into a vector | | f06hbf |
copy a real vector to a complex vector | | f06kff |
generate an elementary reflection | | f06hrf |
generate a sequence of plane rotations | | f06hqf |
multiply vector by a complex scalar, preserving input vector | | f06hdf |
multiply vector by a real scalar, preserving input vector | | f06kdf |
multiply vector by complex diagonal matrix | | f06hcf |
multiply vector by real diagonal matrix | | f06kcf |
multiply vector by reciprocal of a real scalar | | f06kef |
update Euclidean norm in scaled form | | f06kjf |
broadcast a scalar into a vector | | f06dbf |
apply an elementary reflection to a vector (Linpack style) | | f06fuf |
apply an elementary reflection to a vector (NAG style) | | f06ftf |
apply a symmetric plane rotation to two vectors | | f06fpf |
broadcast a scalar into a vector | | f06fbf |
cosine of angle between two vectors | | f06faf |
elements of largest and smallest absolute value | | f06flf |
generate an elementary reflection (Linpack style) | | f06fsf |
generate an elementary reflection (NAG style) | | f06frf |
generate a sequence of plane rotations | | f06fqf |
index of last non-negligible element | | f06klf |
multiply vector by a scalar, preserving input vector | | f06fdf |
multiply vector by diagonal matrix | | f06fcf |
multiply vector by reciprocal of a scalar | | f06fef |
update Euclidean norm in scaled form | | f06fjf |
weighted Euclidean norm of a vector | | f06fkf |
Level 2 (Matrix-vector and matrix) operations, | | |
complex matrix and vector(s), | | |
apply sequence of plane rotations to a rectangular matrix, | | |
complex cosine, real sine | | f06tyf |
real cosine, complex sine | | f06txf |
compute a norm or the element of largest absolute value, | | |
Hermitian matrix, packed form | | f06udf |
Hermitian tridiagonal matrix | | f06upf |
symmetric matrix, packed form | | f06ugf |
triangular matrix, packed form | | f06ukf |
compute upper Hessenberg matrix by applying sequence of plane rotations to an upper triangular matrix | | f06tvf |
compute upper spiked matrix by applying sequence of plane rotations to an upper triangular matrix | | f06twf |
symmetric packed matrix | | f06tcf |
permute rows or columns of a matrix, | | |
permutations represented by an integer array | | f06vjf |
permutations represented by a real array | | f06vkf |
QR factorization by sequence of plane rotations, | | |
of rank-1 update of upper triangular matrix | | f06tpf |
of upper triangular matrix augmented by a full row | | f06tqf |
QR factorization of UZ or RQ factorization of ZU, where U is upper triangular and Z is a sequence of plane rotations | | f06ttf |
QR or RQ factorization by sequence of plane rotations, | | |
of upper Hessenberg matrix | | f06trf |
symmetric packed matrix | | f06tdf |
matrix copy, rectangular or trapezoidal | | f06tff |
solution of a system of equations, | | |
unitary similarity transformation of a Hermitian matrix, | | |
as sequence of plane rotations | | f06tmf |
real matrix and vector(s), | | |
apply sequence of plane rotations to a rectangular matrix | | f06qxf |
compute a norm or the element of largest absolute value, | | |
symmetric matrix, packed form | | f06rdf |
symmetric tridiagonal matrix | | f06rpf |
triangular matrix, packed form | | f06rkf |
compute upper Hessenberg matrix by applying sequence of plane rotations to an upper triangular matrix | | f06qvf |
compute upper spiked matrix by applying sequence of plane rotations to an upper triangular matrix | | f06qwf |
orthogonal similarity transformation of a symmetric matrix, | | |
as sequence of plane rotations | | f06qmf |
permute rows or columns of a matrix, | | |
permutations represented by an integer array | | f06qjf |
permutations represented by a real array | | f06qkf |
QR factorization by sequence of plane rotations, | | |
of rank-1 update of upper triangular matrix | | f06qpf |
of upper triangular matrix augmented by a full row | | f06qqf |
QR factorization of UZ or RQ factorization of ZU, where U is upper triangular and Z is a sequence of plane rotations | | f06qtf |
QR or RQ factorization by sequence of plane rotations, | | |
of upper Hessenberg matrix | | f06qrf |
matrix copy, rectangular or trapezoidal | | f06qff |
solution of a system of equations, | | |
Level 3 (Matrix-matrix) operations, | | |
solution of triangular systems of equations, RFP format | | f06wpf (ztfsm) |
solution of triangular systems of equations, RFP format | | f06wbf (dtfsm) |
Sparse level 1 (vector) operations, | | |
5
Auxiliary Routines Associated with Library Routine Arguments
None.
6
Routines Withdrawn or Scheduled for Withdrawal
None.
7
References
Dodson D S and Grimes R G (1982) Remark on Algorithm 539 ACM Trans. Math. Software 8 403–404
Dodson D S, Grimes R G and Lewis J G (1991) Sparse extensions to the Fortran basic linear algebra subprograms ACM Trans. Math. Software 17 253–263
Dongarra J J, Du Croz J J, Duff I S and Hammarling S (1990) A set of Level 3 basic linear algebra subprograms ACM Trans. Math. Software 16 1–28
Dongarra J J, Du Croz J J, Hammarling S and Hanson R J (1988) An extended set of FORTRAN basic linear algebra subprograms ACM Trans. Math. Software 14 1–32
Dongarra J J, Moler C B, Bunch J R and Stewart G W (1979) LINPACK Users' Guide SIAM, Philadelphia
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Lawson C L, Hanson R J, Kincaid D R and Krogh F T (1979) Basic linear algebra supbrograms for Fortran usage ACM Trans. Math. Software 5 308–325