|
Anasazi Version of the Day
|
TSQR's six computational kernels. More...
#include <Tsqr_Combine.hpp>
Public Member Functions | |
| void | factor_first (const Ordinal nrows, const Ordinal ncols, Scalar A[], const Ordinal lda, Scalar tau[], Scalar work[]) const |
| void | apply_first (const ApplyType &applyType, const Ordinal nrows, const Ordinal ncols_C, const Ordinal ncols_A, const Scalar A[], const Ordinal lda, const Scalar tau[], Scalar C[], const Ordinal ldc, Scalar work[]) const |
| void | apply_inner (const ApplyType &apply_type, const Ordinal m, const Ordinal ncols_C, const Ordinal ncols_Q, const Scalar A[], const Ordinal lda, const Scalar tau[], Scalar C_top[], const Ordinal ldc_top, Scalar C_bot[], const Ordinal ldc_bot, Scalar work[]) const |
| void | factor_inner (const Ordinal m, const Ordinal n, Scalar R[], const Ordinal ldr, Scalar A[], const Ordinal lda, Scalar tau[], Scalar work[]) const |
| void | factor_pair (const Ordinal n, Scalar R_top[], const Ordinal ldr_top, Scalar R_bot[], const Ordinal ldr_bot, Scalar tau[], Scalar work[]) const |
| void | apply_pair (const ApplyType &apply_type, const Ordinal ncols_C, const Ordinal ncols_Q, const Scalar R_bot[], const Ordinal ldr_bot, const Scalar tau[], Scalar C_top[], const Ordinal ldc_top, Scalar C_bot[], const Ordinal ldc_bot, Scalar work[]) const |
Static Public Member Functions | |
| static bool | QR_produces_R_factor_with_nonnegative_diagonal () |
TSQR's six computational kernels.
This class encapsulates six computational primitives for TSQR (in which R, R_1, and R_2 each represent an n x n upper triangular matrix, A represents an m x n cache block, and C_1 and C_2 represent cache blocks with some number of columsn p):
CombineImpl is a particular implementation. Its interface includes the same six public methods as does Combine's interface. Combine itself just uses CombineImpl's implementations of these six methods.
Definition at line 65 of file Tsqr_Combine.hpp.
| static bool TSQR::Combine< Ordinal, Scalar, CombineImpl >::QR_produces_R_factor_with_nonnegative_diagonal | ( | ) | [inline, static] |
Whether or not the QR factorizations computed by methods of this class produce an R factor with all nonnegative diagonal entries.
Definition at line 76 of file Tsqr_Combine.hpp.
| void TSQR::Combine< Ordinal, Scalar, CombineImpl >::factor_first | ( | const Ordinal | nrows, |
| const Ordinal | ncols, | ||
| Scalar | A[], | ||
| const Ordinal | lda, | ||
| Scalar | tau[], | ||
| Scalar | work[] | ||
| ) | const [inline] |
Compute the QR factorization of the nrows by ncols matrix A (with leading dimension lda). Overwrite the upper triangle of A with the resulting R factor, and the lower trapezoid of A (along with the length ncols tau array) with the implicitly stored Q factor.
| nrows | [in] Number of rows in A |
| ncols | [in] Number of columns in A |
| A | [in/out] On input: the nrows by ncols matrix (in column-major order, with leading dimension lda) to factor. On output: upper triangle contains the R factor, and lower part contains the implicitly stored Q factor. |
| lda | [in] Leading dimension of A |
| tau | [out] Array of length ncols; on output, the scaling factors for the Householder reflectors |
| work | [out] Workspace array of length ncols |
Definition at line 97 of file Tsqr_Combine.hpp.
| void TSQR::Combine< Ordinal, Scalar, CombineImpl >::apply_first | ( | const ApplyType & | applyType, |
| const Ordinal | nrows, | ||
| const Ordinal | ncols_C, | ||
| const Ordinal | ncols_A, | ||
| const Scalar | A[], | ||
| const Ordinal | lda, | ||
| const Scalar | tau[], | ||
| Scalar | C[], | ||
| const Ordinal | ldc, | ||
| Scalar | work[] | ||
| ) | const [inline] |
Apply the Q factor, as computed by factor_first() and stored implicitly in A and tau, to the matrix C.
Definition at line 110 of file Tsqr_Combine.hpp.
| void TSQR::Combine< Ordinal, Scalar, CombineImpl >::apply_inner | ( | const ApplyType & | apply_type, |
| const Ordinal | m, | ||
| const Ordinal | ncols_C, | ||
| const Ordinal | ncols_Q, | ||
| const Scalar | A[], | ||
| const Ordinal | lda, | ||
| const Scalar | tau[], | ||
| Scalar | C_top[], | ||
| const Ordinal | ldc_top, | ||
| Scalar | C_bot[], | ||
| const Ordinal | ldc_bot, | ||
| Scalar | work[] | ||
| ) | const [inline] |
Apply the Q factor stored in [R; A] to [C_top; C_bot]. The C blocks are allowed, but not required, to have different leading dimensions (ldc_top resp. ldc_bottom). R is upper triangular, so we do not need it; the Householder reflectors representing the Q factor are stored compactly in A (specifically, in all of A, not just the lower triangle).
In the "sequential under parallel" version of TSQR, this function belongs to the sequential part (i.e., operating on cache blocks on a single processor).
| apply_type | [in] NoTranspose means apply Q, Transpose means apply Q^T, and ConjugateTranspose means apply Q^H. |
| m | [in] number of rows of A |
| ncols_C | [in] number of columns of [C_top; C_bot] |
| ncols_Q | [in] number of columns of [R; A] |
| A | [in] m by ncols_Q matrix, in which the Householder reflectors representing the Q factor are stored |
| lda | [in] leading dimension of A |
| tau | [in] array of length ncols_Q, storing the scaling factors for the Householder reflectors representing Q |
| C_top | [inout] ncols_Q by ncols_C matrix |
| ldc_top | [in] leading dimension of C_top |
| C_bot | [inout] m by ncols_C matrix |
| ldc_bot | [in] leading dimension of C_bot |
| work | [out] workspace array of length ncols_C |
Definition at line 152 of file Tsqr_Combine.hpp.
| void TSQR::Combine< Ordinal, Scalar, CombineImpl >::factor_inner | ( | const Ordinal | m, |
| const Ordinal | n, | ||
| Scalar | R[], | ||
| const Ordinal | ldr, | ||
| Scalar | A[], | ||
| const Ordinal | lda, | ||
| Scalar | tau[], | ||
| Scalar | work[] | ||
| ) | const [inline] |
Perform one "inner" QR factorization step of sequential / parallel TSQR. (In either case, only one processor calls this routine.)
In the "sequential under parallel" version of TSQR, this function belongs to the sequential part (i.e., operating on cache blocks on a single processor). Only the first cache block $A_0$ is factored as $Q_0 R_0 = A_0$ (see tsqr_factor_first); subsequent cache blocks $A_k$ are factored using this routine, which combines them with $R_{k-1}$.
Here is the matrix to factor: \[ {pmatrix} R_{k-1} \ % $A_{k-1}$ is $m_{k-1} n$ with $m_{k-1} n$ A_k \ % $m_k n$ with $m_k n$ {pmatrix} \]
Since $R_{k-1}$ is n by n upper triangular, we can store the Householder reflectors (representing the Q factor of $[R_{k-1}; A_k]$) entirely in $A_k$ (specifically, in all of $A_k$, not just below the diagonal).
| m | [in] Number of rows in the "bottom" block to factor. The number of rows in the top block doesn't matter, given the assumptions above, as long as $m_{k-1} n$. |
| n | [in] Number of columns (same in both blocks) |
| R | [inout] "Top" upper triangular n by n block $R_{k-1}$. Overwritten with the new R factor $R_k$ of $[R_{k-1}; A_k]$. |
| ldr | [in] Leading dimension of R |
| A | [inout] "Bottom" dense m by n block $A_k$. Overwritten with the Householder reflectors representing the Q factor of $[R_{k-1}; A_k]$. |
| tau | [out] Scaling factors of the Householder reflectors. Corresponds to the TAU output of LAPACK's _GEQRF. |
| work | [out] Workspace (length >= n; don't need lwork or workspace query) |
Definition at line 208 of file Tsqr_Combine.hpp.
| void TSQR::Combine< Ordinal, Scalar, CombineImpl >::factor_pair | ( | const Ordinal | n, |
| Scalar | R_top[], | ||
| const Ordinal | ldr_top, | ||
| Scalar | R_bot[], | ||
| const Ordinal | ldr_bot, | ||
| Scalar | tau[], | ||
| Scalar | work[] | ||
| ) | const [inline] |
Compute QR factorization of [R_top; R_bot]. Store resulting R factor in R_top and Householder reflectors in R_bot.
| n | [in] Number of rows and columns of each of R_top and R_bot |
| R_top | [inout] n by n upper triangular matrix |
| ldr_top | [in] Leading dimension of R_top |
| R_bot | [inout] n by n upper triangular matrix |
| ldr_bot | [in] Leading dimension of R_bot |
| tau | [out] Scaling factors for Householder reflectors |
| work | [out] Workspace array (of length >= n) |
Definition at line 232 of file Tsqr_Combine.hpp.
| void TSQR::Combine< Ordinal, Scalar, CombineImpl >::apply_pair | ( | const ApplyType & | apply_type, |
| const Ordinal | ncols_C, | ||
| const Ordinal | ncols_Q, | ||
| const Scalar | R_bot[], | ||
| const Ordinal | ldr_bot, | ||
| const Scalar | tau[], | ||
| Scalar | C_top[], | ||
| const Ordinal | ldc_top, | ||
| Scalar | C_bot[], | ||
| const Ordinal | ldc_bot, | ||
| Scalar | work[] | ||
| ) | const [inline] |
Apply Q factor (or Q^T or Q^H) of the 2*ncols_Q by ncols_Q matrix [R_top; R_bot] (stored in R_bot and tau) to the 2*ncols_Q by ncols_C matrix [C_top; C_bot]. The two blocks C_top and C_bot may have different leading dimensions (ldc_top resp. ldc_bot).
| apply_type | [in] NoTranspose means apply Q, Transpose means apply Q^T, and ConjugateTranspose means apply Q^H. |
Definition at line 252 of file Tsqr_Combine.hpp.
1.7.4