CP_FM_GEMM
Section can be repeated.
Benchmark and test the cp_fm_gemm routines by computing the matrix product C = A*B.
Keywords
Keyword descriptions
- FORCE_BLOCKSIZE: logical = F
Lone keyword: T
Usage: FORCE_BLOCKSIZE
Forces the requested block size, even if this leaves some processes without any data.
- GRID_2D: integer[2] = 1 1
Usage: GRID_2D 64 16
Explicitly set the BLACS 2D processor layout. If the product of the two values differs from the number of MPI ranks, the setting is ignored and a default, nearly square layout is used.
- K: integer = 256
Usage: K 1024
Dimension 1 of C.
- M: integer = 256
Usage: M 1024
Inner dimension of the product, shared by A and B.
- N: integer = 256
Usage: N 1024
Dimension 2 of C.
- NCOL_BLOCK: integer = 32
Usage: NCOL_BLOCK 64
Block size for columns.
- NROW_BLOCK: integer = 32
Usage: NROW_BLOCK 64
Block size for rows.
- N_LOOP: integer = 10
Usage: N_LOOP 10
Number of cp_fm_gemm operations being timed (useful for small matrices).
- ROW_MAJOR: logical = T
Lone keyword: T
Usage: ROW_MAJOR .FALSE.
Use a row-major BLACS grid.
- TRANSA: logical = F
Lone keyword: T
Usage: TRANSA
Transpose matrix A.
- TRANSB: logical = F
Lone keyword: T
Usage: TRANSB
Transpose matrix B.
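Example: the interplay of GRID_2D, NROW_BLOCK/NCOL_BLOCK and FORCE_BLOCKSIZE can be illustrated with a small Python sketch. It is not CP2K code. The function names are hypothetical, the "nearly square" fallback shown here (largest divisor of the rank count not exceeding its square root) is an assumption about what the default layout looks like, and the ownership count assumes a standard block-cyclic distribution starting at process 0.

```python
import math


def choose_grid(requested, n_ranks):
    """Return a (rows, cols) process grid for n_ranks MPI ranks.

    The requested grid is kept only if rows * cols equals n_ranks,
    mirroring the GRID_2D description. Otherwise a nearly square grid
    is returned (assumed fallback, not taken from the CP2K sources).
    """
    rows, cols = requested
    if rows * cols == n_ranks:
        return rows, cols
    # Assumed fallback: largest divisor of n_ranks that is <= sqrt(n_ranks).
    rows = max(d for d in range(1, math.isqrt(n_ranks) + 1) if n_ranks % d == 0)
    return rows, n_ranks // rows


def local_extent(n_global, block, n_procs, proc):
    """Rows (or columns) owned by `proc` along one grid dimension.

    Assumes a block-cyclic distribution in which blocks of size `block`
    are dealt out round-robin, starting with process 0.
    """
    full_blocks, remainder = divmod(n_global, block)
    count = (full_blocks // n_procs) * block  # complete rounds of dealing
    extra = full_blocks % n_procs             # leftover full blocks
    if proc < extra:
        count += block
    elif proc == extra:
        count += remainder                    # final partial block, if any
    return count


def idle_processes(n_global, block, n_procs):
    """Processes along one grid dimension that own no data at all."""
    return [p for p in range(n_procs)
            if local_extent(n_global, block, n_procs, p) == 0]
```

Under these assumptions, `choose_grid((64, 16), 1024)` keeps the requested layout, while `choose_grid((1, 1), 12)` falls back to `(3, 4)`. With a dimension of 256, a block size of 32 and 16 processes along that dimension, `idle_processes(256, 32, 16)` returns processes 8 to 15: there are only 8 blocks to distribute, which is the situation FORCE_BLOCKSIZE explicitly permits.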