Foreword and FAQ
It is our great pleasure to present CP2K, an open-source software package for ab initio electronic structure calculations in atomistic simulations. The code is written in Fortran 2008 and has been geared towards large-scale, high-performance CPU and GPU computation with multi-threading, MPI, CUDA and HIP parallelization. For an overview of the capabilities, see Features.
While CP2K started as an implementation of quantum chemical methods for molecular dynamics
simulation (more specifically, the QUICKSTEP module, as presented in VandeVondele2005), the
decades of development since then have drawn on a large team of collaborators with their
innumerable contributions and an ever-growing user base with its valuable feedback, to all of whom
we wish to express our sincere gratitude. As CP2K is freely available in various ways and does not
require registration to use, it is difficult to gather accurate usage statistics; but the
list of publications using CP2K speaks for itself.
We would like to ask you, the users of CP2K, to acknowledge our work by citing the publications listed on the Bibliography page and printed as REFERENCES at the end of the program's output log, in particular the review articles:
Kühne2020, on the theoretical background and algorithms;
Iannuzzi2026, on the practical usage and applications.
Below we have prepared answers to frequently asked questions, which we hope will be helpful both for working with the CP2K package and for the art of computational chemistry in general.
This program is provided “as-is” without any expressed or implied warranty.
Firstly, what does the name CP2K stand for?
Simply put, “CP” means Car-Parrinello, the initials of two scientists, and “2K” means year 2000.
Historically, two formulations were developed for ab initio molecular dynamics (MD):
Car-Parrinello Molecular Dynamics (CPMD) and Born-Oppenheimer Molecular Dynamics
(BOMD). A program also named simply CPMD, featuring Car-Parrinello Molecular Dynamics, began
its development back in the 1990s; the sister project, named CP2K, would become its
spiritual successor at the turn of the new millennium.
Because the original Car-Parrinello formulation has not actually been implemented in CP2K yet, the name may sound slightly misleading. The BOMD formulation is the main one employed by CP2K and has seen mainstream applications in a variety of fields in the 21st century.
Can I try CP2K out somewhere before installation?
Yes. The CP2K Lab is a spin-off commercial platform set up by developers for building structures, writing input files, executing jobs in the cloud and analyzing the outputs. After a free sign-up, the free-tier features already allow experiments with lightweight computations like the one in Run a First Calculation. For those in need of more resources and functionality, the platform offers a paid tier, site licenses and on-premise enterprise support.
What preliminary knowledge does using CP2K need?
In practice, CP2K is built and executed on Linux-based operating systems, ranging from physical high-performance computers for production to virtual machines for quick small tests. This implies the need for Linux knowledge, including the file system, paths, user privileges and permissions, environment variables, shells (most commonly Bash and other POSIX-compatible shells), utility commands, stdin/stdout/stderr, piping and redirection, shell scripts, and modules and library files. In addition, some experience with Fortran, C, and C++ compilers as well as CMake will be helpful for configuration and installation. Optionally, learn about higher-level resource management via job schedulers and queue systems.
On the science side, introductory courses on chemistry, solid-state physics, and statistical mechanics are vital prerequisites before carrying out computer simulations, just as before performing experiments in real life. Moreover, it is essential to have a clear understanding of the theoretical methods used in a simulation; their characteristics and performance, strengths and limitations should be described in the original publications and in follow-up benchmarks. There are two opposite pitfalls to avoid: it is easy to overlook the subtleties and adjust input settings mindlessly, hoping that the black box somehow works, but it is also easy to become absorbed in the maths and spend a lot of time working out equations that are not the focus of the actual research project.
Note
Worth stressing are two overarching aspects of computer simulation:
In spite of ever-growing scientific computing power, most of the time it is not affordable to build an exact 1:1 computational model of the real-life phenomena of interest. The usual practice is to use a much scaled-down model with a limited number of atoms and a finite trajectory length, which should strike a delicate balance between representativeness and feasibility. The discrepancy in space and time scales between simulation and reality is easily neglected when the kinetics are not kept in mind, especially for slow, rare events with high energy barriers that can only be observed over an extended period of time in real life.
There is no need to worry whether a theoretical method is strictly ab initio or not; both styles of deriving methods, “starting from physically rigorous and universal first principles” and “taking empirical results into account by fitting parameters with extra data”, are capable of producing useful algorithms and accurate results depending on the case. What matters more is the performance of a method on the target system of interest, which should have been benchmarked in existing works in the particular subfield of science; the similarity between the target system and the datasets on which the empirical parameters of the method (if any) were fitted can also be telling.
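To make the time-scale point in the note above concrete, here is a minimal sketch that estimates how long one would wait for a single activated event. It assumes a simple Eyring-type transition-state expression, k = (k_B T / h) exp(-E_a / k_B T), for a unimolecular step; the function names and the example barriers are purely illustrative and are not part of CP2K.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant in eV/K
H_EV_S = 4.135667696e-15  # Planck constant in eV*s


def eyring_rate(barrier_ev: float, temperature_k: float) -> float:
    """Rate constant in 1/s from a simple Eyring-type estimate (assumption)."""
    prefactor = K_B_EV * temperature_k / H_EV_S
    return prefactor * math.exp(-barrier_ev / (K_B_EV * temperature_k))


def mean_waiting_time(barrier_ev: float, temperature_k: float) -> float:
    """Average time in seconds between events, taken as 1 / rate."""
    return 1.0 / eyring_rate(barrier_ev, temperature_k)


if __name__ == "__main__":
    # Hypothetical barriers, chosen only to span a few orders of magnitude.
    for barrier in (0.2, 0.5, 1.0):
        wait = mean_waiting_time(barrier, 300.0)
        print(f"barrier {barrier:.1f} eV at 300 K: about {wait:.1e} s per event")
```

Under these assumptions, at 300 K the estimate moves from sub-nanosecond waiting times for a 0.2 eV barrier to roughly hours for a 1.0 eV barrier. A trajectory of limited length can therefore easily miss such an event entirely, which is the kind of discrepancy the note warns about.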
How do I create the atomistic model for CP2K input?
This is done with external visualization and construction programs. Considering that a task of geometry and/or cell optimization is usually the very first CP2K job, some general rules are discussed on the relevant documentation page.
If available, computational databases and benchmark sets are the most recommended source of structures, because their entries have already been subject to some electronic-structure calculation. Even the cheap methods and loose thresholds of a high-throughput screening and optimization can make a structure qualitatively reasonable by chemical and physical intuition, although further optimization is still needed.
On the other hand, structures from experimental characterization are frequently not
“computation-ready”, and thus should not be subject to computation without careful validation in
pre-processing. This is particularly prominent for CIF and PDB structures determined by powder or
single-crystal XRD, which can be affected by sample quality and thermal motion.
Watch out for crystallographic disorder and atoms with low resolution or fractional occupancy: using the superposition of all atoms as if every occupancy were 1.00 is highly likely to introduce contacting or even overlapping atoms.
Beware of composition: the atomic structure may not match the intended macroscopic, charge-neutral chemical formula, owing to missing or duplicated hydrogen atoms, small counter ions, solvent or ligand molecules.
Possible remedies range from simple manual editing at the modelling stage, to the use of supercells and the enumeration of special quasirandom structures (common for materials with dopants), to more rigorous XRD refinement and the application of quantum crystallography methods. Further advances in instrumental analysis and structure resolution techniques are expected to eventually benefit computational chemistry greatly.
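The two checks listed above (overlapping atoms and unexpected composition) can be automated before a structure is handed to any calculation. The following is a minimal sketch, assuming the atoms have already been read into a plain list of `(symbol, x, y, z)` tuples in Ångström. The helper names and the 0.7 Å threshold are illustrative choices, and periodic images are deliberately ignored here.

```python
import math
from collections import Counter
from itertools import combinations


def composition(atoms):
    """Count atoms per element to compare against the intended formula."""
    return Counter(atom[0] for atom in atoms)


def close_contacts(atoms, min_distance=0.7):
    """Return (i, j, distance) for atom pairs closer than min_distance.

    The default threshold is an illustrative assumption; periodic images
    are not considered in this sketch.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        distance = math.dist(a[1:], b[1:])
        if distance < min_distance:
            pairs.append((i, j, distance))
    return pairs


# Hypothetical water molecule in which one hydrogen has two alternate
# (disordered) positions that were both kept as if fully occupied.
atoms = [
    ("O", 0.00, 0.00, 0.00),
    ("H", 0.96, 0.00, 0.00),
    ("H", -0.24, 0.93, 0.00),
    ("H", -0.20, 0.95, 0.00),
]

formula = composition(atoms)
contacts = close_contacts(atoms)

if __name__ == "__main__":
    print("composition:", dict(formula))
    for i, j, distance in contacts:
        print(f"atoms {i} and {j} are only {distance:.3f} Å apart")
```

In this made-up example the element count shows three hydrogen atoms where two were intended, and the contact check flags the two alternate hydrogen positions as a near-overlap. Both findings would call for the kind of manual editing or refinement described above.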
Do I need PBC for my model?
Support for periodic boundary conditions (PBC) is a fundamental feature of CP2K, covering the full range of dimensionalities of translational symmetry from 3D and 2D through 1D to 0D. The choice of periodicity determines how connectivity, neighbor lists and integration grids are generated, how the Poisson solver handles the electrostatic interaction, and how the translational and rotational degrees of freedom of the center of mass (i.e. collective motion of the system as a whole) are treated.
If the structure involves condensed-phase matter, such as liquid solutions, solid crystals, surface slabs and other one- and two-dimensional nanomaterials, then PBC is generally used. This also applies to systems with no well-defined repeating unit, like bulk solutions. A huge liquid droplet in the gas phase, whose diameter is so large that the gas-liquid interface is almost flat and surface tension is negligible, may just as well be modelled as a combination of a bulk solution system and an interface between a gaseous/vacuum region and a thin layer of solution, both of which make use of PBC even though the liquid droplet itself is not periodic. However, it may be necessary to validate the size of the periodic cell against the target properties to confirm that it is sufficiently large for sampling, sometimes with the minimum image convention in mind.
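As an illustration of the minimum image convention mentioned above, here is a minimal sketch for the special case of an orthorhombic cell with edge lengths given directly in Ångström. The function names are hypothetical, and neither non-orthogonal cells nor CP2K's own internal treatment are covered.

```python
import math


def minimum_image_distance(a, b, cell):
    """Shortest distance between a and b over all periodic images.

    Assumes an orthorhombic cell; `cell` holds its three edge lengths.
    """
    squared = 0.0
    for xa, xb, length in zip(a, b, cell):
        delta = xb - xa
        # Shift the separation by whole cell lengths to the nearest image.
        delta -= length * round(delta / length)
        squared += delta * delta
    return math.sqrt(squared)


def cutoff_fits_cell(cutoff, cell):
    """True if a spherical cutoff does not exceed half the shortest edge."""
    return cutoff <= 0.5 * min(cell)


if __name__ == "__main__":
    cell = (10.0, 10.0, 10.0)  # illustrative cubic box
    # Two atoms near opposite faces are close neighbours through the boundary.
    print(minimum_image_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), cell))
    print(cutoff_fits_cell(6.0, cell), cutoff_fits_cell(4.0, cell))
```

The second helper encodes the usual rule of thumb for orthorhombic cells: an interaction range or analysis radius should stay within half of the shortest cell edge. Otherwise a particle can interact with more than one image of the same neighbour. This is one simple way to check whether a chosen cell size is large enough for the property being sampled.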
Isolated molecular clusters in the gas phase or vacuum, where external pressure is irrelevant, can be simulated without PBC. A frequent question is why a molecule optimized in vacuum does not match its crystal structure; this is because the ordered packing in the crystalline form creates an environment capable of driving conformational changes. The literature often converts a periodic structure into an isolated model of finite size and applies modifications at the edges in the form of terminal capping atoms/groups or point charges; these treatments are usually intended to adapt the structure to quantum chemistry software with no PBC support, but in CP2K they may not offer extra advantages over an appropriate PBC setup that respects the translational symmetry.
In certain cases, the same process can be simulated both with and without PBC. For example, the reaction between hydroxyl and hydrogen may be modelled as a single \(\mathrm{H_2}\) molecule colliding with a single \(\mathrm{OH}\) radical at different relative orientations, distances and velocities, which does not need PBC, or as a mixture of numerous \(\mathrm{H_2}\) and \(\mathrm{OH}\) species, which needs PBC. The two models may respond differently to external conditions such as temperature, pressure, and any form of energy input, but they provide insights from distinct perspectives.
Does CP2K support k-points?
As k-points are an essential element of solid-state electronic structure, they are of course
supported in a broad sense in the QUICKSTEP module of CP2K. A few specialized features may not have
a complete, verified implementation of k-point support, or are based on theories and
algorithms that do not have a k-point version (as opposed to an isolated, non-periodic
formalism) to begin with. After all, it is not a stretch to think that a novel k-point
generalization of an existing method is worthy of one or more academic publications and takes
serious collaboration and devoted effort to investigate.
The development status and user opinions about k-point support can be found in the dedicated GitHub issue; any request for new features of this kind requires a reference implementation of the k-point formalism in other software.
Where can I meet the CP2K community?
Several discussion venues are available:
The User Forum hosted on Google Groups, with a read-only mirror and a downloadable archive. To use the forum, sign in with a Google account, apply to join and then wait for approval.
The issues and discussions of the official GitHub repository.
The Matter Modeling Stack Exchange has, among other topics, a tag for CP2K.
For Chinese users, there is also a CP2K category in the First-principles subforum of the Computational Chemistry Commune.
Please note that the GitHub issues and discussions are only intended for topics relevant to program development and code implementation, such as reproducible bug reports, well-defined feature requests and revisions to the documentation or manual. For more general help with usage, as well as unexpected behavior that may or may not be a bug, check the other venues first; experts there can handle the questions and determine whether they should be brought to the GitHub issues.
What is the best practice to ask questions?
The general etiquette for requesting tech support online has been summarized nicely by Eric S. Raymond’s How To Ask Questions The Smart Way. (Disclaimer: this link does not imply any connection between the original author and the CP2K developers, nor does it suggest that the original author may be contacted for assistance.)
Before submitting a question, please compose it with sufficient detail, accuracy, and clarity. Approach the process in the same way as giving a presentation to a general audience, or even writing the “Methods” section of a formal academic publication; this includes explaining uncommon acronyms (say, the abbreviated name of a specific class of materials, or anything that is not on the Acronyms page) and giving traceable citations (with publication title, date, and DOI, instead of merely showing a screenshot or a paragraph of copy-pasted text). The release date or git version of CP2K, and any custom modifications, must be stated up front.
For problems related to installation and/or performance, the hardware specification and the configuration of linked libraries should be explained. It should also be made clear where the dependencies come from and how they were prepared, for example via package managers, environment modules, or a build from source.
For error terminations and wrong results, it is imperative to provide a complete input deck and the
output files. The “input deck” encompasses not only the main input file with keyword settings, but
also all of the external files referenced inside it unless they are available under the official
data directory, so that the job can actually be run and tested. Instead of the originally intended
chemical structure and composition, it is better to use a simplified system that triggers the
malfunction reliably; this prevents confidential research information from being disclosed and
reduces the demand on computational resources on the developer side.
Please refrain from posting CP2K-specific suggestions from generic large language models (LLMs) or other types of artificial intelligence (AI). Even if AIs are one day trained on a refined and verified corpus of CP2K-oriented information, they can still hallucinate and generate superficially convincing but scientifically incorrect responses. As with academic publications, the human author is responsible for the correctness of any content produced by AIs.
Lastly, please kindly understand that, although the CP2K developers know the algorithms and their implementation in the program, they may not be in a position to answer every question arising from practice, especially those pertaining to niche research areas where understanding the science and acquiring the skills requires much more extensive academic training than learning to use a program. The best people to consult for guidance of this type are tutors, advisors, experienced colleagues or collaborators in real life and, when attempting to reproduce reported findings, the original authors. This is not to deny anyone's potential to teach themselves at no cost, but rather to point out that communicating with the right professionals has no substitute.
May I join in development and send patches?
Certainly! CP2K welcomes all sorts of contributions, from a small typo fix to modular code refactoring, to interfaces with other packages, to novel implementation of cutting-edge technology… Sharing kindness is an easy feat, and patches make it more complete; that is the essence of open-source programming.
The CP2K project uses Git as the version control tool, and the official code repository is on
GitHub as cp2k. For detailed instructions see the page
Starting development.
Another form of contribution is to enrich the cp2k-examples repository with example inputs and outputs, complete with a post-analysis workflow leading all the way to publication-ready results and discussions if possible. This will help other curious users see the full potential of CP2K in scientific and engineering applications.