GW + Bethe-Salpeter equation
The Bethe-Salpeter equation (BSE) is a method for computing electronic excitation energies and optical absorption spectra. We describe in Sec. 1 the theory and implementation of BSE, in Sec. 2 the BSE input keywords and in Sec. 3 a full CP2K input file of a BSE calculation and the corresponding output. For reviews on BSE, see [Blase2018, Blase2020, Bruneval2015, Sander2015].
1. Theory and implementation of BSE
A central goal of a BSE calculation is to compute electronic excitation energies \(\Omega^{(n)}, n=1,2,\ldots\).
The following ingredients are necessary for computing \(\Omega^{(n)}\):
Occupied Kohn-Sham (KS) orbitals \(\varphi_i(\mathbf{r})\) and empty KS orbitals \(\varphi_a(\mathbf{r})\) from a DFT calculation, where \(i=1,\ldots,N_\mathrm{occ}\) and \(a=N_\mathrm{occ}+1,\ldots,N_\mathrm{occ}+N_\mathrm{empty}\),
GW eigenvalues \(\varepsilon_i^{GW}\) and \(\varepsilon_a^{GW}\) of corresponding KS orbitals.
It is possible to use G0W0, evGW0 or evGW eigenvalues, see details in GW and in Ref. [Golze2019], i.e. we perform BSE@G0W0/evGW0/evGW@DFT. Thus, also input parameters for a DFT and GW calculation influence the BSE calcualation (see full input in Sec. 3.1). We obtain optical properties from BSE solving the following generalized eigenvalue problem that involves the block matrix \(ABBA\):
We abbreviate \(A\) and \(B\) as matrices with index \(A_{ia,jb}\), i.e. they have \(N_\mathrm{occ}N_\mathrm{empty}\) rows and \(N_\mathrm{occ}N_\mathrm{empty}\) columns. The entries of \(A\) and
where \(\delta_{ij}\) is the Kronecker delta. The user sets \(\alpha^S=2\) for computing singlet excitations and \(\alpha^T=0\) for computing triplet excitations. \(v_{pq,rs}\) is the bare Coulomb interaction and \(W_{pq,rs}(\omega=0)\) the statically (\(\omega=0\)) screened Coulomb interaction , where \(p,q,r,s \in [ 1, N_\mathrm{occ}+N_\mathrm{empty}]\) are KS orbital indices. \((\mathbf{X}^{(n)},\mathbf{Y}^{(n)})\) with elements \(X_{ia}^{(n)}\) and \(Y_{ia}^{(n)}\) are the eigenvectors of the excitation \(n\) which relate to the wavefunction of the electronic excitation [Blase2020],
i.e. \(X_{ia}^{(n)}\) and \(Y_{ia}^{(n)}\) describe the transition amplitude between occupied orbital \(\varphi_i\) and empty orbital \(\varphi_a\) of the \(n\)-th excitation.
The Tamm-Dancoff approximation (TDA) is also implemented, which constrains \(B=0\). In case \(A\) is positive definite, and excitation energies \(\Omega^{(n)}>0\), we have \(\mathbf{Y}=0\) and \(\mathbf{X}\) can be computed from the Hermitian eigenvalue problem
Diagonalizing \(A\) in TDA, or the full block-matrix \(ABBA\), takes in the order of \((N_\mathrm{occ} N_\mathrm{empty})^3\) floating point operations. This translates to a computational scaling of \(O(N^6)\) in the system size \(N\).
2. BSE input
For starting a BSE calculation one needs to set the RUN_TYPE to
ENERGY
and the following sections for GW and BSE:
&GW
SELF_CONSISTENCY G0W0 ! can be changed to EV_GW0 or EV_GW
&BSE
TDA TDA+ABBA ! Diagonalizing ABBA and A
SPIN_CONFIG SINGLET ! or TRIPLET
NUM_PRINT_EXC 20 ! Number of printed excitations
ENERGY_CUTOFF_OCC -1 ! Set to positive numbers (eV) to
ENERGY_CUTOFF_EMPTY -1 ! truncate matrices A_ia,jb and B_ia,jb
&END BSE
&END GW
In the upper GW/BSE section, the following keywords have been used:
SELF_CONSISTENCY: Determines which GW self-consistency (G0W0, evGW0 or evGW) is used to calculate the single-particle GW energies \(\varepsilon_p^{GW}\) needed in the BSE calculation.
TDA: Three options available:
ON
diagonalize \(A\),OFF
generalized diagonalization of \(ABBA\),TDA+ABBA
CP2K diagonalizes \(ABBA\) as well as \(A\).
SPIN_CONFIG: Two options available: Choose between
SINGLET
for computing singlet excitation energies \((\alpha^S = 2)\) andTRIPLET
for computing triplet excitation energies \((\alpha^T=0)\). Standard isSINGLET
as an electronic excitation directly after photoexcitation is a singlet due to angular momentum conservation; triplet excited states can form by intersystem crossing.NUM_PRINT_EXC: Number of excitations \(N_\text{exc}^\text{print}\) to be printed.
ENERGY_CUTOFF_OCC \(E_\text{cut}^\text{occ}\): Restrict occupied molecular orbital (MO) indices \(i\) and only use occupied MOs with \(\varepsilon_i\in[\varepsilon_{i=\text{HOMO}}^{GW}-E_\text{cut}^\text{occ},\varepsilon_{i=\text{HOMO}}^{GW}]\). Setting a small
ENERGY_CUTOFF_OCC
drastically reduces the computation time and the memory consumption, but also might affect the computed excitation energies \(\Omega^{(n)}\). Recommended to use for large systems with more than 30 atoms, but we recommend a careful convergence test by increasingENERGY_CUTOFF_OCC
and observing the effect on \(\Omega^{(n)}\) [Liu2020].ENERGY_CUTOFF_EMPTY \(E_\text{cut}^\text{empty}\): Analogous to
ENERGY_CUTOFF_OCC
, but for the empty states, i.e. only empty states in the interval \(\varepsilon_a\in[\varepsilon_{a=\text{LUMO}}^{GW},\varepsilon_{a=\text{LUMO}}^{GW}+E_\text{cut}^\text{empty}]\).
The following settings from DFT will also have an influence on the BSE excitation energies:
XC_FUNCTIONAL: Choose between one of the available xc-functionals. The starting point can have a profound influence on the excitation energies [Bruneval2015, Gui2018, Jacquemin2017]. We either recommend the PBE functional as DFT starting point when using BSE@evGW0@PBE or the PBE0 functional, when using BSE@G0W0@PBE0 (see also SELF_CONSISTENCY).
BASIS_SET: Specify the basis set, which affects \(N_\mathrm{empty}\) and thus the size of the matrices \(A_{ia,jb}\) and \(B_{ia,jb}\). The
aug-cc-pVDZ
basis set should be sufficient for most calculations, but needs to be checked regarding convergence, e.g. usingaug-cc-pVTZ
.
The memory consumption of the BSE algorithm is large, it is approximately \(100 \cdot N_\mathrm{occ}^2 N_\mathrm{empty}^2\) Bytes. You can see \(N_\mathrm{occ}\), \(N_\mathrm{empty}\) and the estimated memory consumption from the BSE output. The BSE implementation is well parallelized, i.e. you can use several nodes that can provide the memory.
We have benchmarked the numerical precision of our BSE implementation and we found excellent agreement within only 10 meV compared to the BSE implementation in FHI aims [Liu2020].
The current BSE implementation in CP2K works for molecules. The inclusion of periodic boundary conditions in a Γ-only approach and with full k-point sampling is work in progress.
3. Minimal example for a BSE calculation
3.1 Input file
In this section, we provide a minimal example of a BSE calculation on H2. For the calculation you need the input file BSE_H2.inp and the aug-cc-DZVP basis (Download).
Please copy both files into your working directory and run CP2K by
mpirun -n 1 cp2k.psmp BSE_H2.inp
which requires 12 GB RAM and takes roughly 2 minutes on 1 core. You can find the input and output
file here. We use the basis
sets aug-cc-pVDZ
and aug-cc-pVDZ-RIFIT
from the file BASIS-aug
. These basis sets can be
obtained from the Basis Set Exchange Library:
aug-cc-pVDZ,
aug-cc-pVDZ-RIFIT.
The geometry for H2 was taken from [vanSetten2015].
3.2 Output
The BSE calculation outputs the excitation energies \(\Omega^{(n)}\) for n = 1, …, \(N_\text{exc}^\text{print}\):
BSE| Excitation energies from solving the BSE without the TDA:
BSE|
BSE| Excitation n Multiplet TDA/full BSE Excitation energy Ω (eV)
BSE| 1 Singlet State -full- 11.8621
BSE| 2 Singlet State -full- 12.8264
BSE| 3 Singlet State -full- 15.6415
BSE| 4 Singlet State -full- 15.6415
Afterwards, the single-particle transitions, i.e. the eigenvector elements \(X_{ia}^{(n)}\), with \(|X_{ia}^{(n)}|\) > EPS_X, are printed:
BSE| Excitations are built up by the following single-particle transitions,
BSE| neglecting contributions where |X_ia^n| < 0.10 :
BSE| -- Quick reminder: HOMO i = 1 and LUMO a = 2 --
BSE|
BSE| Excitation n i => a TDA/full BSE |X_ia^n|
BSE|
BSE| 1 1 => 2 -full- 0.6630
BSE| 1 1 => 4 -full- 0.2549
BSE|
BSE| 2 1 => 3 -full- 0.7059
BSE|
BSE| 3 1 => 6 -full- 0.7076
BSE|
BSE| 4 1 => 5 -full- 0.7076
In the case of H2, the lowest excitation n = 1 is mainly built up by a transition from the HOMO (i=1) to the LUMO (a=2), what is apparent from \(X_{i=\text{HOMO},a=\text{LUMO}}^{(n=1)}=0.663\), and also contains a considerable contribution from the 1=>4 (HOMO=>LUMO+2) transition (\(X_{i=\text{HOMO},a=\text{LUMO+2}}^{(n=1)}=0.2549\) ).
3.3 Large scale calculations
Going to larger systems is a challenge for a GW+BSE-calculation, since the memory consumption increases with \(N_\mathrm{occ}^2 N_\mathrm{empty}^2\). The used \(N_\mathrm{occ}\), \(N_\mathrm{empty}\) and the required memory of a calculation are printed in the output file to estimate the memory consumption. We recommend to use several nodes to provide the required memory. In the following, we provide a sample output of a BSE calculation on a nanographene with 206 atoms which has a memory requirement of 2.2 TB RAM:
BSE| Cutoff occupied orbitals [eV] 80.000
BSE| Cutoff empty orbitals [eV] 10.000
BSE| First occupied index 155
BSE| Last empty index (not MO index!) 517
BSE| Energy of first occupied index [eV] -24.774
BSE| Energy of last empty index [eV] 7.863
BSE| Energy difference of first occupied index to HOMO [eV] 19.487
BSE| Energy difference of last empty index to LUMO [eV] 9.639
BSE| Number of GW-corrected occupied MOs 400
BSE| Number of GW-corrected empty MOs 600
BSE|
BSE| Total peak memory estimate from BSE [GB] 2221.591
BSE| Peak memory estimate per MPI rank from BSE [GB] 4.628
To enable BSE calculations on large molecules, we recommend to use a large supercomputer and setting the keywords ENERGY_CUTOFF_OCC and ENERGY_CUTOFF_EMPTY, see details given above.