NAG Library Routine Document

f07aqf (zcgesv)

2 Specification

Fortran Interface

Subroutine f07aqf (n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, rwork, iter, info)

Integer, Intent (In)	::	n, nrhs, lda, ldb, ldx
Integer, Intent (Out)	::	ipiv(n), iter, info
Real (Kind=nag_wp), Intent (Out)	::	rwork(n)
Complex (Kind=nag_wp), Intent (In)	::	b(ldb,*)
Complex (Kind=nag_wp), Intent (Inout)	::	a(lda,*), x(ldx,*)
Complex (Kind=nag_wp), Intent (Out)	::	work(n*nrhs)
Complex (Kind=nag_rp), Intent (Out)	::	swork(n*(n+nrhs))

C Header Interface

#include <nagmk26.h>

void	f07aqf_ (const Integer *n, const Integer *nrhs, Complex a[], const Integer *lda, Integer ipiv[], const Complex b[], const Integer *ldb, Complex x[], const Integer *ldx, Complex work[], Complexf swork[], double rwork[], Integer *iter, Integer *info)

The routine may be called by its LAPACK name zcgesv.

3 Description

f07aqf (zcgesv) first attempts to factorize the matrix $A$ in single precision and to use this factorization within an iterative refinement procedure to produce a solution with double precision accuracy. If this approach fails, the method switches to a double precision factorization and solve.

The iterative refinement process is stopped if

$$iter > itermax,$$

where iter is the number of iterations carried out thus far and $itermax$ is the maximum number of iterations allowed, which is fixed at 30 iterations. The process is also stopped if for all right-hand sides we have

$$\|resid\| < \sqrt{n} \|x\| \|A\| \epsilon,$$

where $\|resid\|$ is the $\infty$-norm of the residual, $\|x\|$ is the $\infty$-norm of the solution, $\|A\|$ is the $\infty$-operator-norm of the matrix $A$ and $\epsilon$ is the machine precision returned by x02ajf.
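The procedure described above can be sketched in a few lines of Python. This is an illustration only and not the implementation used by f07aqf (zcgesv): it handles a single right-hand side, emulates single precision by rounding through the standard struct module, and takes $\epsilon = 2^{-53}$ as an assumed stand-in for the value returned by x02ajf. The names to_single, lu_factor_single, lu_solve_single and refine are hypothetical.

```python
import math
import struct

def to_single(z):
    """Round a complex value to IEEE single precision (stand-in for Kind=nag_rp)."""
    re, = struct.unpack("f", struct.pack("f", z.real))
    im, = struct.unpack("f", struct.pack("f", z.imag))
    return complex(re, im)

def lu_factor_single(a):
    """LU factorization with partial pivoting; every result is rounded to single."""
    n = len(a)
    lu = [[to_single(v) for v in row] for row in a]
    piv = [0] * n
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ZeroDivisionError("U is exactly singular")
        lu[k], lu[p] = lu[p], lu[k]  # interchange rows k and p
        piv[k] = p
        for i in range(k + 1, n):
            lu[i][k] = to_single(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = to_single(lu[i][j] - lu[i][k] * lu[k][j])
    return lu, piv

def lu_solve_single(lu, piv, rhs):
    """Solve with the single precision factors: interchanges, then L, then U."""
    n = len(lu)
    y = [to_single(v) for v in rhs]
    for k in range(n):
        y[k], y[piv[k]] = y[piv[k]], y[k]
    for i in range(n):  # forward substitution with unit lower triangular L
        for j in range(i):
            y[i] = to_single(y[i] - lu[i][j] * y[j])
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            y[i] = to_single(y[i] - lu[i][j] * y[j])
        y[i] = to_single(y[i] / lu[i][i])
    return y

def refine(a, b, itermax=30):
    """Mixed precision refinement for one right-hand side; assumes n >= 1."""
    n = len(a)
    eps = 2.0 ** -53  # assumption: double precision unit roundoff
    norm_a = max(sum(abs(v) for v in row) for row in a)  # infinity operator norm
    lu, piv = lu_factor_single(a)
    x = lu_solve_single(lu, piv, b)
    it = 0
    while True:
        # The residual is formed in double precision.
        resid = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm_r = max(abs(v) for v in resid)
        norm_x = max(abs(v) for v in x)
        if norm_r < math.sqrt(n) * norm_x * norm_a * eps:
            return x, it
        if it == itermax:
            return x, -(itermax + 1)  # mirrors iter = -31
        corr = lu_solve_single(lu, piv, resid)  # correction from single precision factors
        x = [xi + ci for xi, ci in zip(x, corr)]
        it += 1
```

When the iteration limit is reached the sketch simply returns the last iterate together with the code -31; the routine itself falls back to a double precision factorization and solve, as described above.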

The iterative refinement strategy used by f07aqf (zcgesv) can be more efficient than the corresponding direct full precision algorithm. Since this strategy must perform iterative refinement on each right-hand side, any efficiency gains will reduce as the number of right-hand sides increases. Conversely, as the matrix size increases, the cost of these iterative refinements becomes less significant relative to the cost of factorization. Thus, any efficiency gains will be greatest for a very small number of right-hand sides and for large matrix sizes. The cut-off values for the number of right-hand sides and matrix size, for which the iterative refinement strategy performs better, depend on the relative performance of the reduced and full precision factorization and back-substitution. For now, f07aqf (zcgesv) always attempts the iterative refinement strategy first; you are advised to compare the performance of f07aqf (zcgesv) with that of its full precision counterpart f07anf (zgesv) to determine whether this strategy is worthwhile for your particular problem dimensions.

4 References

Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J J, Du Croz J J, Greenbaum A, Hammarling S, McKenney A and Sorensen D (1999) LAPACK Users' Guide (3rd Edition) SIAM, Philadelphia http://www.netlib.org/lapack/lug

Buttari A, Dongarra J, Langou J, Langou J, Luszczek P and Kurzak J (2007) Mixed precision iterative refinement techniques for the solution of dense linear systems International Journal of High Performance Computing Applications

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore

5 Arguments

1: $n$ – Integer, Input

On entry: $n$, the number of linear equations, i.e., the order of the matrix $A$.

Constraint: $n \geq 0$.

2: $nrhs$ – Integer, Input

On entry: $r$, the number of right-hand sides, i.e., the number of columns of the matrix $B$.

Constraint: $nrhs \geq 0$.

3: $a(lda, *)$ – Complex (Kind=nag_wp) array, Input/Output

Note: the second dimension of the array a must be at least $\max(1, n)$.

On entry: the $n$ by $n$ coefficient matrix $A$.

On exit: if iterative refinement has been successfully used (i.e., if $info = 0$ and $iter \geq 0$), then $A$ is unchanged. If double precision factorization has been used (when $info = 0$ and $iter < 0$), $A$ contains the factors $L$ and $U$ from the factorization $A = P L U$; the unit diagonal elements of $L$ are not stored.

4: $lda$ – Integer, Input

On entry: the first dimension of the array a as declared in the (sub)program from which f07aqf (zcgesv) is called.

Constraint: $lda \geq \max(1, n)$.
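Because Fortran stores arrays by column, element $A(i, j)$ of a sits at offset $(i - 1) + (j - 1) \times lda$ when the array is viewed as flat storage, which is how the C header interface receives it. A minimal sketch of that layout follows; the helper names pack_column_major and element are hypothetical and are not part of the NAG Library.

```python
def pack_column_major(rows, lda):
    """Copy a list of matrix rows into flat column-major storage with leading dimension lda."""
    n = len(rows)
    if lda < max(1, n):
        raise ValueError("lda must be at least max(1, n)")
    ncols = len(rows[0]) if rows else 0
    flat = [0j] * (lda * ncols)  # entries beyond row n in each column are padding
    for j in range(ncols):
        for i in range(n):
            flat[i + j * lda] = rows[i][j]
    return flat

def element(flat, lda, i, j):
    """Return A(i, j) using 1-based indices, as in the Fortran interface."""
    return flat[(i - 1) + (j - 1) * lda]
```

Choosing lda larger than n leaves unused padding at the end of each column, which is why the constraint is an inequality rather than an equality.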

5: $ipiv(n)$ – Integer array, Output

On exit: if no constraints are violated, the pivot indices that define the permutation matrix $P$; at the $i$th step row $i$ of the matrix was interchanged with row $ipiv(i)$. $ipiv(i) = i$ indicates a row interchange was not required. $ipiv$ corresponds either to the single precision factorization (if $info = 0$ and $iter \geq 0$) or to the double precision factorization (if $info = 0$ and $iter < 0$).
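The interchange convention can be illustrated with a short sketch. The helper below (a hypothetical name, not part of the NAG Library) applies the recorded interchanges, in order, to a copy of a vector; ipiv is assumed to hold 1-based indices, as in the Fortran interface. Since $A = P L U$, applying the same interchanges to the rows of $A$ gives $P^T A = L U$.

```python
def apply_row_interchanges(ipiv, v):
    """Apply the interchanges recorded in ipiv (1-based) to a copy of v."""
    out = list(v)
    for i, p in enumerate(ipiv, start=1):
        # At step i, entry i is interchanged with entry ipiv(i).
        out[i - 1], out[p - 1] = out[p - 1], out[i - 1]
    return out
```

For instance, with ipiv = [3, 3, 3] the first step swaps entries 1 and 3, the second swaps entries 2 and 3, and the third leaves the vector unchanged.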

6: $b(ldb, *)$ – Complex (Kind=nag_wp) array, Input

Note: the second dimension of the array b must be at least $\max(1, nrhs)$.

On entry: the $n$ by $r$ right-hand side matrix $B$.

7: $ldb$ – Integer, Input

On entry: the first dimension of the array b as declared in the (sub)program from which f07aqf (zcgesv) is called.

Constraint: $ldb \geq \max(1, n)$.

8: $x(ldx, *)$ – Complex (Kind=nag_wp) array, Output

Note: the second dimension of the array x must be at least $\max(1, nrhs)$.

On exit: if $info = 0$, the $n$ by $r$ solution matrix $X$.

9: $ldx$ – Integer, Input

On entry: the first dimension of the array x as declared in the (sub)program from which f07aqf (zcgesv) is called.

Constraint: $ldx \geq \max(1, n)$.

10: $work(n \times nrhs)$ – Complex (Kind=nag_wp) array, Workspace

11: $swork(n \times (n + nrhs))$ – Complex (Kind=nag_rp) array, Workspace

Note: this array is used in the reduced precision computation; its type nag_rp reflects this usage.

12: $rwork(n)$ – Real (Kind=nag_wp) array, Workspace

13: $iter$ – Integer, Output

On exit: if $iter > 0$, iterative refinement has been successfully used and iter is the number of iterations carried out. If $iter < 0$, iterative refinement has failed for one of the reasons given below and double precision factorization has been carried out instead.

$iter = -1$: Taking into account machine parameters, and the values of n and nrhs, it is not worth working in single precision.
$iter = -2$: Overflow of an entry occurred when moving from double to single precision.
$iter = -3$: An intermediate single precision factorization failed.
$iter = -31$: The maximum permitted number of iterations was exceeded.
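A caller will usually want to turn the value of iter into a diagnostic message. The following sketch shows one way to do so; describe_iter is a hypothetical helper, not part of the NAG Library, and it is only meaningful when $info = 0$. It treats $iter \geq 0$ as success, following the descriptions of a and ipiv above.

```python
# Failure codes as listed in the description of iter.
ITER_FAILURES = {
    -1: "not worth working in single precision for these machine parameters, n and nrhs",
    -2: "overflow of an entry when moving from double to single precision",
    -3: "an intermediate single precision factorization failed",
    -31: "the maximum permitted number of iterations was exceeded",
}

def describe_iter(iter_value):
    """Return a readable summary of the iter output argument (hypothetical helper)."""
    if iter_value >= 0:
        # The descriptions of a and ipiv treat iter >= 0 as success.
        return f"iterative refinement used, {iter_value} iteration(s)"
    reason = ITER_FAILURES.get(iter_value, "unrecognised value")
    return f"double precision factorization used: {reason}"
```

A negative value is not an error in itself: the solution returned in x is still valid when $info = 0$, but it was obtained by the full precision route.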

14: $info$ – Integer, Output

On exit: $info = 0$ unless the routine detects an error (see Section 6).

6 Error Indicators and Warnings

$info < 0$: If $info = -i$, argument $i$ had an illegal value. An explanatory message is output, and execution of the program is terminated.

$info > 0$: Element $〈value〉$ of the diagonal is exactly zero. The factorization has been completed, but the factor $U$ is exactly singular, so the solution could not be computed.

7 Accuracy

The computed solution for a single right-hand side, $\hat{x}$, satisfies an equation of the form

$$(A + E) \hat{x} = b,$$

where $\|E\|_1 = O(\epsilon) \|A\|_1$ and $\epsilon$ is the machine precision. An approximate error bound for the computed solution is given by

$$\frac{\|\hat{x} - x\|_1}{\|x\|_1} \leq \kappa(A) \frac{\|E\|_1}{\|A\|_1},$$

where $\kappa(A) = \|A^{-1}\|_1 \|A\|_1$, the condition number of $A$ with respect to the solution of the linear equations. See Section 4.4 of Anderson et al. (1999) for further details.
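As a small worked illustration of the bound, the sketch below evaluates $\kappa(A)$ in the 1-norm for a 2 by 2 matrix using its explicit inverse, then multiplies by an assumed relative backward error $\|E\|_1 / \|A\|_1$. The helper names are hypothetical, and the explicit inverse is used only because the matrix is tiny; it is not how a condition number would be estimated for a matrix of realistic size.

```python
def norm1(m):
    """Matrix 1-norm: the maximum absolute column sum."""
    return max(sum(abs(m[i][j]) for i in range(len(m))) for j in range(len(m[0])))

def inverse_2x2(m):
    """Explicit inverse of a nonsingular 2 by 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def forward_error_bound(m, rel_backward_error):
    """Return kappa(A) times the relative backward error, as in the bound above."""
    kappa = norm1(m) * norm1(inverse_2x2(m))
    return kappa * rel_backward_error
```

For the diagonal matrix with entries 4 and 1 the condition number is 4, so a relative backward error of order $\epsilon$ gives a relative error bound of order $4\epsilon$.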

8 Parallelism and Performance

f07aqf (zcgesv) is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.

f07aqf (zcgesv) makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

The real analogue of this routine is f07acf (dsgesv).

10 Example

This example solves the equations

$$A x = b,$$

where $A$ is the general matrix

$$A = \left(\begin{array}{rrrr} -1.34 + 2.55i & 0.28 + 3.17i & -6.39 - 2.20i & 0.72 - 0.92i \\ -0.17 - 1.41i & 3.31 - 0.15i & -0.15 + 1.34i & 1.29 + 1.38i \\ -3.29 - 2.39i & -1.91 + 4.42i & -0.14 - 1.35i & 1.72 + 1.35i \\ 2.41 + 0.39i & -0.56 + 1.47i & -0.83 - 0.69i & -1.96 + 0.67i \end{array}\right) \quad \text{and} \quad b = \left(\begin{array}{r} 26.26 + 51.78i \\ 6.43 - 8.68i \\ -5.75 + 25.31i \\ 1.16 + 2.57i \end{array}\right).$$
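The program results are not reproduced in this section. As a sanity check on the data, the sketch below verifies by direct substitution that the vector $x = (1 + i, 2 - 3i, -4 - 5i, 6i)^T$ satisfies the equations to rounding error. That vector was obtained by hand for this illustration and is not quoted program output; the name matvec is hypothetical.

```python
A = [
    [-1.34 + 2.55j, 0.28 + 3.17j, -6.39 - 2.20j, 0.72 - 0.92j],
    [-0.17 - 1.41j, 3.31 - 0.15j, -0.15 + 1.34j, 1.29 + 1.38j],
    [-3.29 - 2.39j, -1.91 + 4.42j, -0.14 - 1.35j, 1.72 + 1.35j],
    [2.41 + 0.39j, -0.56 + 1.47j, -0.83 - 0.69j, -1.96 + 0.67j],
]
b = [26.26 + 51.78j, 6.43 - 8.68j, -5.75 + 25.31j, 1.16 + 2.57j]

# Candidate solution, derived by hand (an assumption of this sketch).
x = [1 + 1j, 2 - 3j, -4 - 5j, 6j]

def matvec(m, v):
    """Multiply a matrix, stored as a list of rows, by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]
max_residual = max(abs(r) for r in residual)  # infinity norm of b - A x
print(max_residual)
```

The printed value is the $\infty$-norm of $b - A x$ evaluated in double precision, so it reflects only the rounding incurred in forming the products and sums.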
