OpenCL
OpenCL is currently supported for DBCSR and DBM/DBT, and can target GPUs as well as other kinds of devices. Kernels can be automatically tuned.
Note: the OpenCL backend uses some functionality from LIBXSMM (dependency). CP2K’s offload-library serving DBM/DBT and other libraries depends on DBCSR’s OpenCL backend.
Installing OpenCL and preparing the runtime environment
Installing an OpenCL runtime depends on the operating system and the device vendor. Debian, for instance, provides two packages called opencl-headers and ocl-icd-opencl-dev, which can be present in addition to a vendor-specific installation. The OpenCL header files are only necessary if CP2K/DBCSR is compiled from source. Please note that some implementations ship with outdated OpenCL headers, which can prevent using the latest features (if an application discovers such features only at compile-time). When building from source, libOpenCL.so (the ICD loader), for instance, is sufficient at link-time. However, an Installable Client Driver (ICD) is ultimately necessary at runtime.
NVIDIA CUDA, AMD HIP, and Intel OneAPI are fully equipped with an OpenCL runtime (if the opencl-headers package is not installed, CPATH may be needed to point into the vendor installation, and similarly LIBRARY_PATH for finding libOpenCL.so at link-time). Installing a minimal or stand-alone OpenCL is also possible, e.g., following the instructions for Debian (or Ubuntu) as given for every release of the Intel Compute Runtime.
The environment variable ACC_OPENCL_VERBOSE prints information at runtime of CP2K about kernels generated (ACC_OPENCL_VERBOSE=2) or executed (ACC_OPENCL_VERBOSE=3), which can be used to check an installation.
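The environment preparation described above might look as follows in a shell. This is only a sketch under assumptions: VENDOR_ROOT and its default /usr/local/cuda are placeholders for wherever the vendor stack is installed, the include and lib64 subdirectories may differ between vendors, and the binary name cp2k.psmp as well as input.inp and output.out are examples rather than requirements.

```shell
# Hypothetical vendor installation prefix (assumption): adjust to your system.
VENDOR_ROOT="${VENDOR_ROOT:-/usr/local/cuda}"

# Only needed when the opencl-headers package is not installed:
# CPATH lets the compiler find the OpenCL headers,
# LIBRARY_PATH lets the linker find libOpenCL.so.
export CPATH="${VENDOR_ROOT}/include${CPATH:+:${CPATH}}"
export LIBRARY_PATH="${VENDOR_ROOT}/lib64${LIBRARY_PATH:+:${LIBRARY_PATH}}"

# 2: report generated kernels, 3: report executed kernels.
export ACC_OPENCL_VERBOSE=2

# Run CP2K only if a binary and an input file are present
# (both names are placeholders for this sketch).
if command -v cp2k.psmp >/dev/null 2>&1 && [ -f input.inp ]; then
  cp2k.psmp -i input.inp -o output.out
fi
```

If no kernel information shows up with ACC_OPENCL_VERBOSE=2, it can be worth checking first that an ICD is installed for the device in question.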
Building CP2K with OpenCL-based DBCSR
CP2K’s toolchain supports --enable-opencl to select DBCSR’s OpenCL backend. This can be combined with --enable-cuda (--gpu-ver is then required) to use a GPU for CP2K’s GRID and PW components (no OpenCL support yet), in which case DBM’s CUDA implementation is preferred.
For manually writing an ARCH-file, add -D__OPENCL and -D__DBCSR_ACC to CFLAGS and add -lOpenCL to the LIBS variable. OFFLOAD_CC and OFFLOAD_FLAGS can duplicate CC and CFLAGS (no special offload compiler is needed). Please also set OFFLOAD_TARGET = opencl to enable the OpenCL backend in DBCSR. For OpenCL, it is not necessary to specify a GPU version (e.g., GPUVER = V100 would map/limit to exts/dbcsr/src/acc/opencl/smm/params/tune_multiply_V100.csv). In fact, GPUVER limits tuned parameters to the specified GPU, whereas by default all tuned parameters are embedded (exts/dbcsr/src/acc/opencl/smm/params/*.csv) and applied at runtime. If auto-tuned parameters are not available for DBCSR, well-chosen defaults will be used to populate kernels at runtime.
Auto-tuned parameters are embedded into the binary, i.e., CP2K does not rely on a hard-coded location. Setting the environment variable OPENCL_LIBSMM_SMM_PARAMS=/path/to/csv-file can supply parameters for an already built application, or OPENCL_LIBSMM_SMM_PARAMS=0 can disable using tuned parameters. Refer to https://cp2k.github.io/dbcsr/ on how to tune kernels (parameters).
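As an illustration of the manual ARCH-file settings, the following sketch writes the OpenCL-related lines into a scratch file that can be reviewed and merged into an existing ARCH-file. The file name opencl-arch-fragment.mk is made up for this example, and the use of += assumes that CC, CFLAGS, and LIBS are already defined earlier in the ARCH-file; an actual ARCH-file needs many more settings than shown here.

```shell
# Scratch file for the OpenCL-related ARCH settings.
# ARCH_FRAGMENT is a hypothetical name, not a file CP2K looks for.
ARCH_FRAGMENT="${ARCH_FRAGMENT:-opencl-arch-fragment.mk}"

# The quoted 'EOF' keeps $(CC) and $(CFLAGS) literal so that make expands them.
cat > "${ARCH_FRAGMENT}" <<'EOF'
# Assumes CC, CFLAGS, and LIBS are defined earlier in the ARCH-file.
CFLAGS        += -D__OPENCL -D__DBCSR_ACC
LIBS          += -lOpenCL
# No special offload compiler is needed: reuse the host compiler and flags.
OFFLOAD_CC     = $(CC)
OFFLOAD_FLAGS  = $(CFLAGS)
OFFLOAD_TARGET = opencl
# GPUVER is left unset so that all tuned parameters are embedded.
EOF

# For an already built binary: 0 disables tuned parameters at runtime.
# Alternatively, point the variable to a CSV file with tuned parameters,
# e.g., OPENCL_LIBSMM_SMM_PARAMS=/path/to/csv-file
export OPENCL_LIBSMM_SMM_PARAMS=0
```

Disabling tuned parameters this way can help to tell apart whether a problem or a performance difference stems from the tuned parameters or from the default kernels.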
Building CP2K with OpenCL-based DBM library
Pass -DCP2K_USE_ACCEL=OPENCL to CMake in addition to following the above instructions for “Building CP2K with OpenCL-based DBCSR”. An additional Makefile rule can be necessary to transform OpenCL code into a resource header file.
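A CMake configure step with this option could be sketched as shown below. The build directory name and the assumption that the command is run from the top of the CP2K source tree are choices made for this example, and a real configuration typically needs further options that are not covered here.

```shell
# Option named in the text above; all other options are omitted in this sketch.
CMAKE_ARGS="-DCP2K_USE_ACCEL=OPENCL"

# Hypothetical build directory name (assumption).
BUILD_DIR="${BUILD_DIR:-build}"

# Configure only if CMake is available and a CMakeLists.txt is present,
# i.e., when run from the top of the source tree.
if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ]; then
  cmake -S . -B "${BUILD_DIR}" ${CMAKE_ARGS}
fi
```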