It differs from f08aec in that it: requires an explicit block size; stores reflector factors that are upper triangular matrices of the chosen block size (rather than scalars); and recursively computes the $Q R$ factorization based on the algorithm of Elmroth and Gustavson (2000).

If $m \geq n$, the factorization is given by:

$A = Q \left(\begin{array}{c} R \\ 0 \end{array}\right),$

where $R$ is an $n$ by $n$ upper triangular matrix and $Q$ is an $m$ by $m$ orthogonal matrix. It is sometimes more convenient to write the factorization as

$A = \left(\begin{array}{cc} Q_{1} & Q_{2} \end{array}\right) \left(\begin{array}{c} R \\ 0 \end{array}\right),$

which reduces to

$A = Q_{1} R,$

where $Q_{1}$ consists of the first $n$ columns of $Q$, and $Q_{2}$ the remaining $m - n$ columns.

If $m < n$, $R$ is upper trapezoidal, and the factorization can be written

$A = Q \left(\begin{array}{cc} R_{1} & R_{2} \end{array}\right),$

where $R_{1}$ is upper triangular and $R_{2}$ is rectangular.

The matrix $Q$ is not formed explicitly but is represented as a product of $\min (m, n)$ elementary reflectors (see the F08 Chapter Introduction for details). Functions are provided to work with $Q$ in this representation (see Section 9).

Note also that for any $k < n$, the information returned represents a $Q R$ factorization of the first $k$ columns of the original matrix $A$.

4 References

Elmroth E and Gustavson F (2000) Applying Recursion to Serial and Parallel $Q R$ Factorization Leads to Better Performance IBM Journal of Research and Development 44(4) 605–624

Golub G H and Van Loan C F (2012) Matrix Computations (4th Edition) Johns Hopkins University Press, Baltimore

5 Arguments

1: $order$ – Nag_OrderType Input

On entry: the order argument specifies the two-dimensional storage scheme being used, i.e., row-major ordering or column-major ordering. C language defined storage is specified by $order = Nag_RowMajor$. See Section 3.1.3 in the Introduction to the NAG Library CL Interface for a more detailed explanation of the use of this argument.

Constraint: $order = Nag_RowMajor$ or $Nag_ColMajor$.

2: $m$ – Integer Input

On entry: $m$, the number of rows of the matrix $A$.

Constraint: $m \geq 0$.

3: $n$ – Integer Input

On entry: $n$, the number of columns of the matrix $A$.

Constraint: $n \geq 0$.

4: $nb$ – Integer Input

On entry: the explicitly chosen block size to be used in computing the $Q R$ factorization. See Section 9 for details.

Constraint: if $\min (m, n) > 0$, $1 \leq nb \leq \min (m, n)$.

5: $a [\dim]$ – double Input/Output

Note: the dimension, dim, of the array a must be at least

$\max (1, pda \times n)$ when $order = Nag_ColMajor$ ;
$\max (1, m \times pda)$ when $order = Nag_RowMajor$ .

The $(i, j)$th element of the matrix $A$ is stored in

$a [(j - 1) \times pda + i - 1]$ when $order = Nag_ColMajor$ ;
$a [(i - 1) \times pda + j - 1]$ when $order = Nag_RowMajor$ .

On entry: the $m$ by $n$ matrix $A$.

On exit: if $m \geq n$, the elements below the diagonal are overwritten by details of the orthogonal matrix $Q$ and the upper triangle is overwritten by the corresponding elements of the $n$ by $n$ upper triangular matrix $R$.

If $m < n$, the strictly lower triangular part is overwritten by details of the orthogonal matrix $Q$ and the remaining elements are overwritten by the corresponding elements of the $m$ by $n$ upper trapezoidal matrix $R$.

6: $pda$ – Integer Input

On entry: the stride separating row or column elements (depending on the value of order) in the array a.

Constraints:

if $order = Nag_ColMajor$ , $pda \geq \max (1, m)$ ;
if $order = Nag_RowMajor$ , $pda \geq \max (1, n)$ .
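The two storage schemes for a can be illustrated with a small indexing helper. This is only a sketch: `SketchOrder`, `a_offset` and `a_min_dim` are hypothetical names introduced here so that the code compiles without the NAG headers, and `size_t` stands in for the library's Integer type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Nag_OrderType (not the library's own type). */
typedef enum { SketchColMajor, SketchRowMajor } SketchOrder;

/* Zero-based offset into a of the (i, j)th element of A, where i and j
   are one-based as in the argument description. */
size_t a_offset(SketchOrder order, size_t i, size_t j, size_t pda)
{
    assert(i >= 1 && j >= 1);
    return order == SketchColMajor ? (j - 1) * pda + (i - 1)
                                   : (i - 1) * pda + (j - 1);
}

/* Minimum length of a: max(1, pda*n) for column-major storage,
   max(1, m*pda) for row-major storage. */
size_t a_min_dim(SketchOrder order, size_t m, size_t n, size_t pda)
{
    size_t len = order == SketchColMajor ? pda * n : m * pda;
    return len > 1 ? len : 1;
}
```

For a 6 by 4 matrix the smallest permitted stride is 6 in column-major order and 4 in row-major order; both give an array of 24 elements.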

7: $t [\dim]$ – double Output

Note: the dimension, dim, of the array t must be at least

$\max (1, pdt \times \min (m, n))$ when $order = Nag_ColMajor$ ;
$\max (1, nb \times pdt)$ when $order = Nag_RowMajor$ .

The $(i, j)$th element of the matrix $T$ is stored in

$t [(j - 1) \times pdt + i - 1]$ when $order = Nag_ColMajor$ ;
$t [(i - 1) \times pdt + j - 1]$ when $order = Nag_RowMajor$ .

On exit: further details of the orthogonal matrix $Q$. The number of blocks is $b = \lceil \frac{k}{nb} \rceil$, where $k = \min (m, n)$ and each block is of order $nb$ except for the last block, which is of order $k - (b - 1) \times nb$. For each of the blocks, an upper triangular block reflector factor is computed: $T_{1}, T_{2}, \dots, T_{b}$. These are stored in the $nb$ by $n$ matrix $T$ as $T = [T_{1} | T_{2} | \dots | T_{b}]$.
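The block layout of $T$ follows from integer arithmetic on $m$, $n$ and $nb$. The sketch below computes the number of blocks and the order of each block from the formulas above; `num_blocks` and `block_order` are hypothetical helper names, and plain `int` is assumed in place of the library's Integer type.

```c
#include <assert.h>

/* Number of blocks b = ceil(k / nb), where k = min(m, n).
   Assumes min(m, n) > 0 and 1 <= nb <= min(m, n). */
int num_blocks(int m, int n, int nb)
{
    int k = m < n ? m : n;
    assert(k > 0 && nb >= 1 && nb <= k);
    return (k + nb - 1) / nb;   /* integer ceiling division */
}

/* Order of block j (one-based): nb for every block except the last,
   which has order k - (b - 1) * nb. */
int block_order(int m, int n, int nb, int j)
{
    int k = m < n ? m : n;
    int b = num_blocks(m, n, nb);
    assert(j >= 1 && j <= b);
    return j < b ? nb : k - (b - 1) * nb;
}
```

With $m = 6$, $n = 4$ and $nb = 3$, for instance, there are two blocks, of order 3 and 1.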

8: $pdt$ – Integer Input

On entry: the stride separating row or column elements (depending on the value of order) in the array t.

Constraints:

if $order = Nag_ColMajor$ , $pdt \geq nb$ ;
if $order = Nag_RowMajor$ , $pdt \geq \max (1, \min (m, n))$ .

9: $fail$ – NagError * Input/Output

The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).

6 Error Indicators and Warnings

NE_ALLOC_FAIL: Dynamic memory allocation failed.
See Section 3.1.2 in the Introduction to the NAG Library CL Interface for further information.
NE_BAD_PARAM: On entry, argument $〈value〉$ had an illegal value.
NE_INT: On entry, $m = 〈value〉$ .
Constraint: $m \geq 0$ .

On entry, $n = 〈value〉$ .
Constraint: $n \geq 0$ .
NE_INT_2: On entry, $pda = 〈value〉$ and $m = 〈value〉$ .
Constraint: $pda \geq \max (1, m)$ .

On entry, $pda = 〈value〉$ and $n = 〈value〉$ .
Constraint: $pda \geq \max (1, n)$ .

On entry, $pdt = 〈value〉$ and $nb = 〈value〉$ .
Constraint: $pdt \geq nb$ .
NE_INT_3: On entry, $nb = 〈value〉$ , $m = 〈value〉$ and $n = 〈value〉$ .
Constraint: if $\min (m, n) > 0$ , $1 \leq nb \leq \min (m, n)$ .

On entry, $pdt = 〈value〉$ , $m = 〈value〉$ and $n = 〈value〉$ .
Constraint: $pdt \geq \max (1, \min (m, n))$ .
NE_INTERNAL_ERROR: An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
See Section 7.5 in the Introduction to the NAG Library CL Interface for further information.
NE_NO_LICENCE: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library CL Interface for further information.
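The integer constraints behind NE_INT, NE_INT_2 and NE_INT_3 can be checked before the call. The following is a minimal sketch, not the library's own validation: the enum names and `check_args` are hypothetical, plain `int` stands in for Integer, and the order in which the checks are made is an assumption.

```c
/* Hypothetical stand-ins; these are not NAG types. */
typedef enum { SketchColMajor, SketchRowMajor } SketchOrder;
typedef enum { SKETCH_OK, SKETCH_NE_INT, SKETCH_NE_INT_2, SKETCH_NE_INT_3 } SketchStatus;

/* Returns the error category that the documented constraints suggest,
   or SKETCH_OK when every integer constraint is satisfied. */
SketchStatus check_args(SketchOrder order, int m, int n, int nb, int pda, int pdt)
{
    int k = m < n ? m : n;                      /* k = min(m, n) */

    if (m < 0 || n < 0)
        return SKETCH_NE_INT;                   /* m >= 0 and n >= 0 */
    if (k > 0 && (nb < 1 || nb > k))
        return SKETCH_NE_INT_3;                 /* 1 <= nb <= min(m, n) */

    if (order == SketchColMajor) {
        if (pda < (m > 1 ? m : 1))
            return SKETCH_NE_INT_2;             /* pda >= max(1, m) */
        if (pdt < nb)
            return SKETCH_NE_INT_2;             /* pdt >= nb */
    } else {
        if (pda < (n > 1 ? n : 1))
            return SKETCH_NE_INT_2;             /* pda >= max(1, n) */
        if (pdt < (k > 1 ? k : 1))
            return SKETCH_NE_INT_3;             /* pdt >= max(1, min(m, n)) */
    }
    return SKETCH_OK;
}
```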

7 Accuracy

The computed factorization is the exact factorization of a nearby matrix $(A + E)$, where

${‖E‖}_{2} = O (ε) {‖A‖}_{2},$

and $ε$ is the machine precision.

8 Parallelism and Performance

f08abc makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.

Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this function. Please also consult the Users' Note for your implementation for any additional implementation-specific information.

9 Further Comments

The total number of floating-point operations is approximately $\frac{2}{3} n^{2} (3 m - n)$ if $m \geq n$ or $\frac{2}{3} m^{2} (3 n - m)$ if $m < n$.
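As a rough illustration of the operation count, the two cases can be wrapped in one function. `qr_flops` is a hypothetical helper name; it takes `double` arguments so that large dimensions do not overflow integer arithmetic.

```c
/* Approximate floating-point operation count for the QR factorization
   of an m by n matrix, using the two-case formula quoted above. */
double qr_flops(double m, double n)
{
    return m >= n ? (2.0 / 3.0) * n * n * (3.0 * m - n)
                  : (2.0 / 3.0) * m * m * (3.0 * n - m);
}
```

For a square matrix the two branches agree and reduce to $\frac{4}{3} n^{3}$.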

To apply $Q$ to an arbitrary real rectangular matrix $C$, f08abc may be followed by a call to f08acc. For example,

nag_lapackeig_dgemqrt(order,Nag_LeftSide,Nag_Trans,m,p,MIN(m,n),nb,a,pda,t,pdt, 
c,pdc,&fail)

forms $C = Q^{T} C$, where $C$ is $m$ by $p$.

To form the orthogonal matrix $Q$ explicitly, simply initialize the $m$ by $m$ matrix $C$ to the identity matrix and form $C = Q C$ using f08acc as above.
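The initialization step can be sketched as follows for column-major storage. `set_identity_colmajor` is a hypothetical helper, and the stride `pdc` is assumed to satisfy $pdc \geq m$; the subsequent library call is not shown because it needs the NAG headers.

```c
#include <stddef.h>

/* Fill the leading m by m part of c with the identity matrix, using
   column-major storage with stride pdc (assumed pdc >= m). Padding
   elements beyond row m in each column are left untouched. */
void set_identity_colmajor(double *c, size_t m, size_t pdc)
{
    for (size_t j = 0; j < m; ++j)
        for (size_t i = 0; i < m; ++i)
            c[j * pdc + i] = (i == j) ? 1.0 : 0.0;
}
```

With $C$ prepared this way, a call of the form quoted earlier, with $p = m$ and without the transpose, would be expected to overwrite $C$ with $Q$.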

The block size, nb, used by f08abc is supplied explicitly through the interface. For moderate and large sizes of matrix, the block size can have a marked effect on the efficiency of the algorithm with the optimal value being dependent on problem size and platform. A value of $nb = 64 ≪ \min (m, n)$ is likely to achieve good efficiency and it is unlikely that an optimal value would exceed $340$.
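One way to pick a starting block size that always satisfies the constraint on nb is to clamp the suggested value of 64 to $\min (m, n)$. This is a heuristic sketch only: `choose_nb` is a hypothetical helper, the clamping rule is an assumption rather than a library recommendation, and the best value should still be found by timing on the target platform.

```c
/* Starting guess for nb: 64 when min(m, n) is large enough, otherwise
   min(m, n) itself so that 1 <= nb <= min(m, n) holds. When
   min(m, n) == 0 the constraint does not apply and 1 is returned. */
int choose_nb(int m, int n)
{
    int k = m < n ? m : n;
    if (k <= 0)
        return 1;
    return k < 64 ? k : 64;
}
```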

To compute a $Q R$ factorization with column pivoting, use f08bbc or f08bec.

The complex analogue of this function is f08apc.

10 Example

This example solves the linear least squares problems

minimize ${‖A x_{i} - b_{i}‖}_{2}, \quad i = 1, 2$

where $b_{1}$ and $b_{2}$ are the columns of the matrix $B$,

$A = \left(\begin{array}{rrrr} -0.57 & -1.28 & -0.39 & 0.25 \\ -1.93 & 1.08 & -0.31 & -2.14 \\ 2.30 & 0.24 & 0.40 & -0.35 \\ -1.93 & 0.64 & -0.66 & 0.08 \\ 0.15 & 0.30 & 0.15 & -2.13 \\ -0.02 & 1.03 & -1.43 & 0.50 \end{array}\right)$ and $B = \left(\begin{array}{rr} -2.67 & 0.41 \\ -0.55 & -3.10 \\ 3.34 & -4.01 \\ -0.77 & 2.76 \\ 0.48 & -6.17 \\ 4.10 & 0.21 \end{array}\right).$


NAG CL Interface f08abc (dgeqrt)
