f04qaf : NAG Library, Mark 26

f04qaf solves sparse nonsymmetric equations, sparse linear least squares problems and sparse damped linear least squares problems, using a Lanczos algorithm.

Fortran Interface

Subroutine f04qaf (m, n, b, x, se, aprod, damp, atol, btol, conlim, itnlim, &
                   msglvl, itn, anorm, acond, rnorm, arnorm, xnorm, work, &
                   ruser, lruser, iuser, liuser, inform, ifail)

Integer, Intent (In)	::	m, n, msglvl, lruser, liuser
Integer, Intent (Inout)	::	itnlim, iuser(liuser), ifail
Integer, Intent (Out)	::	itn, inform
Real (Kind=nag_wp), Intent (In)	::	damp, atol, btol, conlim
Real (Kind=nag_wp), Intent (Inout)	::	b(m), ruser(lruser)
Real (Kind=nag_wp), Intent (Out)	::	x(n), se(n), anorm, acond, rnorm, arnorm, xnorm, work(n,2)
External	::	aprod

C Header Interface

#include <nagmk26.h>

void f04qaf_ (const Integer *m, const Integer *n, double b[], double x[], double se[],
    void (NAG_CALL *aprod)(Integer *mode, const Integer *m, const Integer *n, double x[], double y[], double ruser[], const Integer *lruser, Integer iuser[], const Integer *liuser),
    const double *damp, const double *atol, const double *btol, const double *conlim, Integer *itnlim, const Integer *msglvl, Integer *itn, double *anorm, double *acond, double *rnorm, double *arnorm, double *xnorm, double work[], double ruser[], const Integer *lruser, Integer iuser[], const Integer *liuser, Integer *inform, Integer *ifail)

f04qaf can be used to solve a system of linear equations

$$A x = b \tag{1}$$

where $A$ is an $n$ by $n$ sparse nonsymmetric matrix, or can be used to solve linear least squares problems, so that f04qaf minimizes the value $ρ$ given by

$$ρ = ‖r‖, \quad r = b - A x \tag{2}$$

where $A$ is an $m$ by $n$ sparse matrix and $‖r‖$ denotes the Euclidean length of $r$, so that ${‖r‖}^{2} = r^{T} r$. A damping argument, $λ$, may be included in the least squares problem, in which case f04qaf minimizes the value $ρ$ given by

$$ρ^{2} = {‖r‖}^{2} + λ^{2} {‖x‖}^{2}. \tag{3}$$

$λ$ is supplied as the argument damp and should be zero if the solution to problem (1) or (2) is required. Minimizing $ρ$ in (3) is often called ridge regression.

f04qaf is based upon algorithm LSQR (see Paige and Saunders (1982a) and Paige and Saunders (1982b)), which solves these problems using the Lanczos process. The routine does not require $A$ explicitly; instead, $A$ is specified via aprod, which must perform the operations $y + A x$ and $x + A^{T} y$ for a given $n$-element vector $x$ and $m$-element vector $y$. An argument to aprod specifies which of the two operations is required on a given entry.
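The two operations can be sketched in Python (used here only so the example runs stand-alone; the routine itself expects a Fortran or C callback). The coordinate storage of $A$, the names `make_aprod` and `triples`, and the mapping of mode 1 to $y + A x$ and mode 2 to $x + A^{T} y$ are assumptions made for illustration. The user workspace arguments of the real interface are omitted.

```python
def make_aprod(triples):
    """Return a callback applying a sparse matrix held as (i, j, a_ij) triples.

    Indices are 0-based. The mode convention (1 for y + A x, 2 for x + A^T y)
    is an assumption made for this sketch.
    """
    def aprod(mode, m, n, x, y):
        if mode == 1:
            # y <- y + A x, with x left unchanged
            for i, j, a in triples:
                y[i] += a * x[j]
        elif mode == 2:
            # x <- x + A^T y, with y left unchanged
            for i, j, a in triples:
                x[j] += a * y[i]
        else:
            raise ValueError("mode must be 1 or 2")
    return aprod


# Hypothetical 3 by 2 matrix A = [[1, 2], [0, 3], [4, 0]]
triples = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0), (2, 0, 4.0)]
aprod = make_aprod(triples)

x = [1.0, 1.0]
y = [0.0, 0.0, 0.0]
aprod(1, 3, 2, x, y)    # y now holds A x

x2 = [0.0, 0.0]
aprod(2, 3, 2, x2, y)   # x2 now holds A^T (A x)
```

Both branches accumulate into the output vector rather than overwriting it, which matches the $y + A x$ and $x + A^{T} y$ forms above.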

The routine also returns estimates of the standard errors of the sample regression coefficients ($x_{i}$, for $i = 1, 2, \dots, n$) given by the diagonal elements of the estimated variance-covariance matrix $V$. When problem (2) is being solved and $A$ is of full rank, $V$ is given by

$$V = s^{2} {(A^{T} A)}^{-1}, \quad s^{2} = ρ^{2} / (m - n), \quad m > n$$

and when problem (3) is being solved, $V$ is given by

$$V = s^{2} {(A^{T} A + λ^{2} I)}^{-1}, \quad s^{2} = ρ^{2} / m, \quad λ \neq 0.$$

Let $\bar{A}$ denote the matrix

$$\bar{A} = A, \quad λ = 0; \qquad \bar{A} = \begin{pmatrix} A \\ λ I \end{pmatrix}, \quad λ \neq 0, \tag{4}$$

let $\bar{r}$ denote the residual vector

$$\bar{r} = r, \quad λ = 0; \qquad \bar{r} = \begin{pmatrix} b \\ 0 \end{pmatrix} - \bar{A} x, \quad λ \neq 0 \tag{5}$$

corresponding to an iterate $x$, so that $ρ = ‖\bar{r}‖$ is the function being minimized, and let $‖A‖$ denote the Frobenius (Euclidean) norm of $A$. Then the routine accepts $x$ as a solution if it is estimated that one of the following two conditions is satisfied:

$$ρ \leq {tol}_{1} ‖\bar{A}‖ \, ‖x‖ + {tol}_{2} ‖b‖ \tag{6}$$

$$‖{\bar{A}}^{T} \bar{r}‖ \leq {tol}_{1} ‖\bar{A}‖ \, ρ \tag{7}$$

where ${tol}_{1}$ and ${tol}_{2}$ are user-supplied tolerances which estimate the relative errors in $A$ and $b$ respectively. Condition (6) is appropriate for compatible problems where, in theory, we expect the residual to be zero, and it will be satisfied by an acceptable solution $x$ to a compatible problem. Condition (7) is appropriate for incompatible systems where we do not expect the residual to be zero. It is based on the observation that, in theory, ${\bar{A}}^{T} \bar{r} = 0$ when $x$ is a solution to the least squares problem, so (7) will be satisfied by an acceptable solution $x$ to a linear least squares problem.
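Conditions (6) and (7) can be checked directly on a small dense problem, as in the following Python sketch. It evaluates the quantities exactly, whereas the routine works with estimates of them. The function name `stopping_tests` and the dense row-list storage are assumptions for illustration only.

```python
import math


def norm2(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(t * t for t in v))


def stopping_tests(A, b, x, damp, tol1, tol2):
    """Return (rho, test6, test7) for a dense A given as a list of rows."""
    n = len(x)
    # Abar stacks A on top of damp * I when damp != 0, as in (4)
    Abar = [list(row) for row in A]
    bbar = list(b)
    if damp != 0.0:
        for j in range(n):
            Abar.append([damp if k == j else 0.0 for k in range(n)])
            bbar.append(0.0)
    # rbar = (b; 0) - Abar x, as in (5)
    rbar = [bi - sum(a * xj for a, xj in zip(row, x))
            for row, bi in zip(Abar, bbar)]
    rho = norm2(rbar)
    # Abar^T rbar, which vanishes at a least squares solution
    atr = [sum(Abar[i][j] * rbar[i] for i in range(len(Abar)))
           for j in range(n)]
    # Frobenius norm of Abar
    anorm = math.sqrt(sum(v * v for row in Abar for v in row))
    test6 = rho <= tol1 * anorm * norm2(x) + tol2 * norm2(b)
    test7 = norm2(atr) <= tol1 * anorm * rho
    return rho, test6, test7


# Compatible system: b lies in the range of A, so the residual is zero
rho_c, t6_c, t7_c = stopping_tests(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0], [1.0, 2.0],
    0.0, 1e-8, 1e-8)

# Incompatible system: x = 1 is the least squares solution of x = 0, x = 2
rho_i, t6_i, t7_i = stopping_tests(
    [[1.0], [1.0]], [0.0, 2.0], [1.0], 0.0, 1e-8, 1e-8)
```

In the incompatible case the residual stays at $\sqrt{2}$, so (6) fails, but $A^{T} r = 0$ and (7) holds.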

The routine also includes a test to prevent convergence to solutions, $x$, with unacceptably large elements. This can happen if $A$ is nearly singular or is nearly rank deficient. If we let the singular values of $\bar{A}$ be

$$σ_{1} \geq σ_{2} \geq \dots \geq σ_{n} \geq 0$$

then the condition number of $\bar{A}$ is defined as

$$cond (\bar{A}) = σ_{1} / σ_{k}$$

where $σ_{k}$ is the smallest nonzero singular value of $\bar{A}$ and hence $k$ is the rank of $\bar{A}$. When $k < n$, $\bar{A}$ is rank deficient, the least squares solution is not unique and f04qaf will normally converge to the minimal length solution. In practice $\bar{A}$ will not have exactly zero singular values, but may instead have small singular values that we wish to regard as zero.

The routine provides for this possibility by terminating if

$$cond (\bar{A}) \geq c_{\lim} \tag{8}$$

where $c_{\lim}$ is a user-supplied limit on the condition number of $\bar{A}$. For problem (1), termination with this condition indicates that $A$ is nearly singular; for problem (2), it indicates that $A$ is nearly rank deficient and so has near linear dependencies in its columns. In this case inspection of $‖r‖$, $‖A^{T} r‖$ and $‖x‖$, which are all returned by the routine, will indicate whether or not an acceptable solution has been found. Condition (8), perhaps in conjunction with $λ \neq 0$, can be used to try to ‘regularize’ least squares solutions. A full discussion of the stopping criteria is given in Section 6 of Paige and Saunders (1982a).
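The interaction between damping and test (8) can be illustrated with a short Python sketch. Since ${\bar{A}}^{T} \bar{A} = A^{T} A + λ^{2} I$, the singular values of $\bar{A}$ are $\sqrt{σ_{i}^{2} + λ^{2}}$, where $σ_{i}$ are here the singular values of $A$. The sketch assumes these are known exactly (the routine only estimates the condition number), and the function names and numerical values are hypothetical.

```python
import math


def cond_abar(sigmas, damp):
    """Condition number of Abar from the singular values of A and the damping.

    Assumes at least one resulting singular value is nonzero.
    """
    sbar = [math.sqrt(s * s + damp * damp) for s in sigmas]
    nonzero = [s for s in sbar if s > 0.0]
    # Ratio of the largest to the smallest nonzero singular value
    return max(nonzero) / min(nonzero)


def exceeds_limit(sigmas, damp, c_lim):
    """Test (8): cond(Abar) >= c_lim."""
    return cond_abar(sigmas, damp) >= c_lim


sigmas = [2.0, 1.0, 1.0e-9]   # hypothetical, nearly rank deficient A
c_lim = 1.0e6                 # hypothetical limit on the condition number

undamped = exceeds_limit(sigmas, 0.0, c_lim)     # cond is about 2e9
damped = exceeds_limit(sigmas, 1.0e-3, c_lim)    # cond is about 2e3
```

With no damping the tiny singular value triggers (8); a modest $λ$ lifts every singular value of $\bar{A}$ to at least $λ$ and the test no longer fires.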

Introduction of a nonzero damping argument $λ$ tends to reduce the size of the computed solution and to make its components less sensitive to changes in the data, and f04qaf is applicable when a value of $λ$ is known a priori. To have an effect, $λ$ should normally be at least $\sqrt{ε} ‖A‖$, where $ε$ is the machine precision. For further discussion see Paige and Saunders (1982b) and the references given there.
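As a rough guide, the threshold $\sqrt{ε} ‖A‖$ can be computed as in the Python sketch below. It assumes a small dense $A$ held as a list of rows and uses Python's `sys.float_info.epsilon` for $ε$; the library's own definition of machine precision may differ from this by a small constant factor. The function name `min_effective_damp` is hypothetical.

```python
import math
import sys


def min_effective_damp(rows, eps=sys.float_info.epsilon):
    """Return sqrt(eps) times the Frobenius norm of a dense matrix."""
    fro = math.sqrt(sum(v * v for row in rows for v in row))
    return math.sqrt(eps) * fro


A = [[3.0, 0.0], [0.0, 4.0]]         # Frobenius norm is 5
threshold = min_effective_damp(A)    # damping below this has little effect
```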

Whenever possible the matrix $A$ should be scaled so that the relative errors in the elements of $A$ are all of comparable size. Such a scaling helps to prevent the least squares problem from being unnecessarily sensitive to data errors and will normally reduce the number of iterations required. At the very least, in the absence of better information, the columns of $A$ should be scaled to have roughly equal column length.
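One way to carry out the column scaling is sketched below in Python, assuming a small dense $A$ stored as a list of rows. Each column is divided by its Euclidean length, giving $A D$ with $D$ diagonal; if $y$ solves the scaled undamped problem then $x = D y$ solves the original one. The function names are hypothetical, and with nonzero damping the scaling also changes the penalty term, which this sketch does not address.

```python
import math


def scale_columns(rows):
    """Scale each column of a dense matrix to unit Euclidean length.

    Returns the scaled matrix A D and the diagonal of D.
    """
    m, n = len(rows), len(rows[0])
    norms = [math.sqrt(sum(rows[i][j] ** 2 for i in range(m)))
             for j in range(n)]
    # Leave all-zero columns unscaled to avoid division by zero
    d = [1.0 / c if c > 0.0 else 1.0 for c in norms]
    scaled = [[rows[i][j] * d[j] for j in range(n)] for i in range(m)]
    return scaled, d


def unscale_solution(y, d):
    """Map a solution y of the scaled problem back to x = D y."""
    return [yj * dj for yj, dj in zip(y, d)]


A = [[3.0, 0.0], [4.0, 100.0]]    # column lengths 5 and 100
AD, d = scale_columns(A)
x = unscale_solution([5.0, 100.0], d)
```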

Paige C C and Saunders M A (1982a) LSQR: An algorithm for sparse linear equations and sparse least squares ACM Trans. Math. Software 8 43–71

Paige C C and Saunders M A (1982b) Algorithm 583 LSQR: Sparse linear equations and least squares problems ACM Trans. Math. Software 8 195–209

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

When the problem is compatible, the computed solution $x$ will satisfy the equation

$$r = b - A x,$$

where an estimate of $‖r‖$ is returned in the argument rnorm. When the problem is incompatible, the computed solution $x$ will satisfy the equation

$${\bar{A}}^{T} \bar{r} = e,$$

where an estimate of $‖e‖$ is returned in the argument arnorm. See also Section 6.2 of Paige and Saunders (1982b).

f04qaf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

The time taken by f04qaf is likely to be principally determined by the time taken in aprod, which is called twice on each iteration, once with $mode = 1$ and once with $mode = 2$. The time taken per iteration by the remaining operations in f04qaf is approximately proportional to $\max (m, n)$.

The Lanczos process will usually converge more quickly if $A$ is preconditioned by a nonsingular matrix $M$ that approximates $A$ in some sense and is also chosen so that equations of the form $M y = c$ can efficiently be solved for $y$. For a discussion of preconditioning, see the F11 Chapter Introduction. In the context of f04qaf, problem (1) is equivalent to

$$(A M^{-1}) y = b, \quad M x = y$$

and problem (2) is equivalent to minimizing

$$ρ = ‖r‖, \quad r = b - (A M^{-1}) y, \quad M x = y.$$

Note that the normal matrix satisfies ${(A M^{-1})}^{T} (A M^{-1}) = M^{-T} (A^{T} A) M^{-1}$, so that the preconditioning $A M^{-1}$ is equivalent to the preconditioning $M^{-T} (A^{T} A) M^{-1}$ of the normal matrix $A^{T} A$.

Preconditioning can be incorporated into f04qaf simply by coding aprod to compute $y + A M^{-1} x$ and $x + M^{-T} A^{T} y$ in place of $y + A x$ and $x + A^{T} y$ respectively, and then solving the equations $M x = y$ for $x$ on return from f04qaf (here $y$ denotes the solution vector returned by f04qaf for the preconditioned problem). The quantity $y + A M^{-1} x$ should be computed by solving $M z = x$ for $z$ and then computing $y + A z$, and $x + M^{-T} A^{T} y$ should be computed by solving $M^{T} z = A^{T} y$ for $z$ and then forming $x + z$.
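The recipe above can be sketched in Python for the simplest case of a diagonal preconditioner $M$, where each solve with $M$ or $M^{T}$ is an elementwise division. The coordinate storage of $A$, the function names, the choice of a diagonal $M$ and the mode convention (1 for the product with $A M^{-1}$, 2 for the product with its transpose) are all assumptions for illustration.

```python
def make_preconditioned_aprod(triples, mdiag):
    """Callback for A M^{-1} with M = diag(mdiag) and A as (i, j, a_ij) triples."""
    def aprod(mode, m, n, x, y):
        if mode == 1:
            # Solve M z = x, then form y <- y + A z
            z = [xj / mj for xj, mj in zip(x, mdiag)]
            for i, j, a in triples:
                y[i] += a * z[j]
        elif mode == 2:
            # Form w = A^T y, solve M^T z = w, then form x <- x + z
            w = [0.0] * n
            for i, j, a in triples:
                w[j] += a * y[i]
            for j in range(n):
                x[j] += w[j] / mdiag[j]
        else:
            raise ValueError("mode must be 1 or 2")
    return aprod


def recover_solution(y, mdiag):
    """Solve M x = y for x once the preconditioned solution y is available."""
    return [yj / mj for yj, mj in zip(y, mdiag)]


# Hypothetical A = [[2, 0], [0, 4], [2, 4]] with M = diag(2, 4),
# so that A M^{-1} = [[1, 0], [0, 1], [1, 1]]
triples = [(0, 0, 2.0), (1, 1, 4.0), (2, 0, 2.0), (2, 1, 4.0)]
mdiag = [2.0, 4.0]
aprod = make_preconditioned_aprod(triples, mdiag)

y = [0.0, 0.0, 0.0]
aprod(1, 3, 2, [1.0, 2.0], y)     # y holds (A M^{-1}) (1, 2)^T

x = [0.0, 0.0]
aprod(2, 3, 2, x, [1.0, 2.0, 3.0])  # x holds (A M^{-1})^T (1, 2, 3)^T
```

For a general $M$ the two divisions would be replaced by whatever solver is efficient for $M$ and $M^{T}$, for example triangular solves with stored factors.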

When $msglvl > 0$, f04qaf will produce output (except in the case where the routine fails with $ifail = 1$) on the advisory message channel (see x04abf).

When $msglvl \geq 2$, a summary line is printed periodically giving the following information:

Output	Meaning
ITN	Iteration number, $k$.
X(1)	The first element of the current iterate $x_{k}$.
FUNCTION	The current value of the function, $ρ$, being minimized.
COMPAT	An estimate of $‖{\bar{r}}_{k}‖ / ‖b‖$, where ${\bar{r}}_{k}$ is the residual corresponding to $x_{k}$. This value should converge to zero (in theory) if and only if the problem is compatible. COMPAT decreases monotonically.
INCOMPAT	An estimate of $‖{\bar{A}}^{T} {\bar{r}}_{k}‖ / (‖\bar{A}‖ ‖{\bar{r}}_{k}‖)$, which should converge to zero if and only if $ρ$ is nonzero at the solution. INCOMPAT is not usually monotonic.
NRM(ABAR)	A monotonically increasing estimate of $‖\bar{A}‖$.
COND(ABAR)	A monotonically increasing estimate of the condition number $cond (\bar{A})$.

This example solves the linear least squares problem

$$\min ρ = ‖r‖, \quad r = b - A x$$

where $A$ is the $13$ by $12$ matrix and $b$ is the $13$-element vector given by

$$A = \left( \begin{array}{rrrrrrrrrrrr}
1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & -1 & 4 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & -1 & 4 & -1 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 0 & 0 & -1 & 4 & -1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & -1 & 4 & -1 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 1
\end{array} \right), \quad
b = -h^{2} \left( \begin{array}{c} 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ -h^{-3} \end{array} \right)$$

with $h = 0.1$.
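For experimentation, the example data can be transcribed into Python as below. The dense row list mirrors the matrix above, and the conversion to 0-based (row, column, value) triples is one possible sparse storage; the variable names are hypothetical and this is not the NAG example program.

```python
H = 0.1  # mesh spacing h from the text

# Dense transcription of the 13 by 12 matrix A
ROWS = [
    [ 1,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  0],
    [ 0,  1,  0,  0, -1,  0,  0,  0,  0,  0,  0,  0],
    [ 0,  0,  1, -1,  0,  0,  0,  0,  0,  0,  0,  0],
    [-1,  0, -1,  4, -1,  0,  0, -1,  0,  0,  0,  0],
    [ 0, -1,  0, -1,  4, -1,  0,  0, -1,  0,  0,  0],
    [ 0,  0,  0,  0, -1,  1,  0,  0,  0,  0,  0,  0],
    [ 0,  0,  0,  0,  0,  0,  1, -1,  0,  0,  0,  0],
    [ 0,  0,  0, -1,  0,  0, -1,  4, -1,  0, -1,  0],
    [ 0,  0,  0,  0, -1,  0,  0, -1,  4, -1,  0, -1],
    [ 0,  0,  0,  0,  0,  0,  0,  0, -1,  1,  0,  0],
    [ 0,  0,  0,  0,  0,  0,  0, -1,  0,  0,  1,  0],
    [ 0,  0,  0,  0,  0,  0,  0,  0, -1,  0,  0,  1],
    [ 1,  1,  1,  0,  0,  1,  1,  0,  0,  1,  1,  1],
]

# b = -h^2 * (0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, -h^-3)^T
PATTERN = [0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, -(H ** -3)]
b = [-(H ** 2) * p for p in PATTERN]

# Coordinate-form storage of the nonzero entries only
triples = [(i, j, float(v))
           for i, row in enumerate(ROWS)
           for j, v in enumerate(row) if v != 0]

m, n = len(ROWS), len(ROWS[0])
```

The last element of $b$ works out to $h^{-1} = 10$, and the leading 12 rows form a symmetric block, which is a quick check that the transcription is consistent.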

Such a problem can arise by considering the Neumann problem on a rectangle

$$\begin{matrix} & \frac{\partial u}{\partial n} = 0 & & \\ \frac{\partial u}{\partial n} = 0 & \nabla^{2} u = g (x, y) & \frac{\partial u}{\partial n} = 0 & \int_{C} u = 1 \\ & \frac{\partial u}{\partial n} = 0 & & \end{matrix}$$

where $C$ is the boundary of the rectangle, and discretizing with a square mesh.

[Figure: the square mesh used for the discretization]

The $12$ by $12$ symmetric part of $A$ represents the difference equations and the final row comes from the normalizing condition. The example program has $g (x, y) = 1$ at all the internal mesh points, but apart from this is written in a general manner so that the number of rows (NROWS) and columns (NCOLS) in the grid can readily be altered.

On entry, $m < 1$, or $n < 1$, or $lruser < 1$, or $liuser < 1$.

NAG Library Routine Document

f04qaf (real_gen_sparse_lsqsol)

Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

9.1 Description of the Printed Output

10 Example

10.1 Program Text

10.2 Program Data

10.3 Program Results
