GW + Bethe-Salpeter equation

In this section, we discuss the basics for computing optical properties of molecules using the Bethe-Salpeter equation (BSE) in CP2K. The BSE enables the computation of electronic excitation energies and optical absorption spectra, for a review, see [Blase2018, Blase2020, Bruneval2015, Sander2015]. In this howto, we describe in Sec. 1 the theory and implementation of BSE, in Sec. 2 the BSE input keywords and in Sec. 3 a full CP2K input file of a BSE calculation and the corresponding output.

1. Theory and implementation of BSE

A central goal of a BSE calculation is to compute electronic excitation energies \(\Omega^{(n)}, n=1,2,\ldots\) (cf. Refs. [Blase2018, Blase2020] for more usecases and details).

The following ingredients are necessary for computing \(\Omega^{(n)}\):

  • Occupied Kohn-Sham (KS) orbitals \(\varphi_i(\mathbf{r})\) and empty KS orbitals \(\varphi_a(\mathbf{r})\) from a DFT calculation, where \(i=1,\ldots,N_\mathrm{occ}\) and \(a=N_\mathrm{occ}+1,\ldots,N_\mathrm{occ}+N_\mathrm{empty}\),

  • \(GW\) eigenvalues \(\varepsilon_i^{GW}\) and \(\varepsilon_a^{GW}\) of corresponding KS orbitals.

In CP2K, it is possible to use \(G_0W_0\), ev\(GW_0\) or ev\(GW\) eigenvalues, see details in GW and in Ref. [Golze2019], i.e. we perform BSE@\(G_0W_0\)/ev\(GW_0\)/ev\(GW\)@DFT. Thus, also input parameters for a DFT and \(GW\) calculation can be given (see full input in Sec. 3.1). We obtain optical properties from BSE solving the following generalized eigenvalue problem that involves the block matrix \(ABBA\):

\[\begin{split}\left( \begin{array}{cc}A & B\\B & A\end{array} \right)\left( \begin{array}{cc}\mathbf{X}^{(n)}\\\mathbf{Y}^{(n)}\end{array} \right) = \Omega^{(n)}\left(\begin{array}{cc}1&0\\0&-1\end{array}\right)\left(\begin{array}{cc}\mathbf{X}^{(n)}\\\mathbf{Y}^{(n)}\end{array}\right) \quad .\end{split}\]

We abbreviate \(A\) and \(B\) as matrices with index \(A_{ia,jb}\), i.e. they have \(N_\mathrm{occ}N_\mathrm{empty}\) rows and \(N_\mathrm{occ}N_\mathrm{empty}\) columns. The entries of \(A\) and \(B\) are given by [Blase2018]

\[\begin{split} \begin{align} A_{ia,jb} &= (\varepsilon_a^{GW}-\varepsilon_i^{GW})\delta_{ij}\delta_{ab} + \alpha^\mathrm{S/T} v_{ia,jb} - W_{ij,ab}(\omega=0) \quad ,\\ B_{ia,jb} &= \alpha^\mathrm{(S/T)} v_{ia,bj} - W_{ib,aj}(\omega=0) \quad . \end{align}\end{split}\]

where \(\delta_{ij}\) is the Kronecker delta. The user sets \(\alpha^S=2\) for computing singlet excitations and \(\alpha^T=0\) for computing triplet excitations. \(v_{pq,rs}\) is the bare Coulomb interaction and \(W_{pq,rs}(\omega=0)\) the statically (\(\omega=0\)) screened Coulomb interaction , where \(p,q,r,s \in [ 1, N_\mathrm{occ}+N_\mathrm{empty}]\) are KS orbital indices. \((\mathbf{X}^{(n)},\mathbf{Y}^{(n)})\) with elements \(X_{ia}^{(n)}\) and \(Y_{ia}^{(n)}\) are the eigenvectors of the excitation \(n\) which relate to the wavefunction of the electronic excitation [Blase2020],

\[ \begin{align} \Psi_\text{excitation}^{(n)}(\mathbf{r}_e,\mathbf{r}_h) = \sum_{ia} X_{ia}^{(n)} \varphi_i(\mathbf{r}_h) \varphi_a(\mathbf{r}_e) + Y_{ia}^{(n)} \varphi_i(\mathbf{r}_e) \varphi_a(\mathbf{r}_h) \quad , \end{align}\]

i.e. \(X_{ia}^{(n)}\) and \(Y_{ia}^{(n)}\) describe the transition amplitude between occupied orbital \(\varphi_i\) and empty orbital \(\varphi_a\) of the \(n\)-th excitation.

the Tamm-Dancoff approximation (TDA) is also implemented, which constrains \(B=0\). In case \(A\) is positive definite, and excitation energies \(\Omega^{(n)}>0\), we have \(\mathbf{Y}=0\) and \(\mathbf{X}\) can be computed from the Hermitian eigenvalue problem

\[ A \mathbf{X}^{(n)}_\mathbf{TDA} = \Omega^{(n)}_\mathbf{TDA} \mathbf{X}^{(n)}_\mathbf{TDA} \quad .\]

Diagonalizing \(A\) in TDA, or the full block-matrix \(ABBA\), takes in the order of \((N_\mathrm{occ} N_\mathrm{empty})^3\) floating point operations. This translates to a computational scaling of \(O(N^6)\) in the system size \(N\), e.g. the number of electrons.

2. BSE input

For starting a BSE calculation one needs to set the RUN_TYPE to ENERGY and the following sections for GW and BSE:

  SELF_CONSISTENCY      G0W0      ! can be changed to EV_GW0 or EV_GW
    TDA                 TDA+ABBA  ! Diagonalizing ABBA and A
    NUM_PRINT_EXC       20        ! Number of printed excitations
    ENERGY_CUTOFF_OCC   -1        ! Set to positive numbers (eV) to
    ENERGY_CUTOFF_EMPTY -1        ! truncate matrices A_ia,jb and B_ia,jb

In the upper GW/BSE section, the following keywords have been used:

  • SELF_CONSISTENCY: Determines which GW self-consistency (\(G_0W_0\), ev\(GW_0\) or ev\(G_0W_0\)) is used to calculate the single-particle GW energies \(\varepsilon_p^{GW}\) needed in the BSE calculation.

  • TDA: Three options available:

    • ON diagonalize of \(A\),

    • OFF Generalized diagonalization of \(ABBA\),

    • TDA+ABBA CP2K diagonalizes \(ABBA\) as well as \(A\).

  • SPIN_CONFIG: Two options available:

    • SINGLET for computing singlet excitation energies (\(\alpha^S=2\)),

    • TRIPLET for computing triplet excitation energies (\(\alpha^T=0\)).

  • NUM_PRINT_EXC: Number of excitations \(N_\text{exc}^\text{print}\) to be printed.

  • ENERGY_CUTOFF_OCC \(E_\text{cut}^\text{occ}\): Restrict occupied molecular orbital (MO) indices \(i\) and only use occupied MOs with \(\varepsilon_i\in[\varepsilon_{i=\text{HOMO}}^{GW}-E_\text{cut}^\text{occ},\varepsilon_{i=\text{HOMO}}^{GW}]\). Setting a small ENERGY_CUTOFF_OCC drastically reduces the computation time and the memory consumption, but also might affect the computed excitation energies \(\Omega^{(n)}\). Recommended to use for large systems with more than 30 atoms, but we recommend a careful convergence test by increasing ENERGY_CUTOFF_OCC and observing the effect on \(\Omega^{(n)}\) [Liu2020].

  • ENERGY_CUTOFF_EMPTY \(E_\text{cut}^\text{empty}\): Analogous to ENERGY_CUTOFF_OCC, but for the empty states, i.e. only empty states in the interval \(\varepsilon_a\in[\varepsilon_{a=\text{LUMO}}^{GW},\varepsilon_{a=\text{LUMO}}^{GW}+E_\text{cut}^\text{empty}]\).

In addition to these keywords, the following settings from DFT will have an influence on the BSE excitation energies:

  • XC_FUNCTIONAL: Choose between one of the available xc-functionals. The starting point can have a profound influence on the excitation energies [Bruneval2015, Gui2018, Jacquemin2017]. We either recommend the PBE functional as DFT starting point when using BSE@ev\(GW_0\)@PBE or the PBE0 functional, when using BSE@\(G_0W_0\)@PBE0 (see also SELF_CONSISTENCY).

  • BASIS_SET: Specify the basis set, which affects \(N_\mathrm{empty}\) and thus the size of the matrices \(A_{ia,jb}\) and \(B_{ia,jb}\). The aug-cc-pVDZ basis set should be sufficient for most calculations, but needs to be checked regarding convergence, e.g. using aug-cc-pVTZ.

The memory consumption of the BSE algorithm is large, it is approximately \(100 \cdot N_\mathrm{occ}^2 N_\mathrm{empty}^2\) Bytes. You can see \(N_\mathrm{occ}\), \(N_\mathrm{empty}\) and the estimated memory consumption from the BSE output. The BSE implementation is well parallelized, i.e. you can use several nodes that can provide the memory.

We have benchmarked the numerical precision of our BSE implementation and we found excellent agreement within only 10 meV compared to the BSE implementation in FHI aims [Liu2020].

The current BSE implementation in CP2K works for molecules. The inclusion of periodic boundary conditions in a \(\Gamma\)-only approach and with full \(k\)-point sampling is work in progress.

3. Minimal example for a BSE calculation

3.1 Input file

In this section, we provide a minimal example of a BSE calculation on \(\mathrm{H}_2\). For the calculation you need the input file BSE_H2.inp and the aug-cc-DZVP basis (Download).

Please copy both files into your working directory and run CP2K by

mpirun -n 1 cp2k.psmp BSE_H2.inp

which requires 12 GB RAM and takes roughly 2 minutes on 1 core. You can find the output file also in the Download.

  METHOD Quickstep
    BASIS_SET_FILE_NAME BASIS-aug               ! Custom Basis set file (aug-cc-pVDZ and aug-cc-pVDZ-RIFIT 
    POTENTIAL_FILE_NAME POTENTIAL               ! from the Basis Set Exchange library -
      METHOD GAPW                               ! All electron calculation
      EPS_DEFAULT 1.0E-16
      EPS_PGF_ORB 1.0E-16
    &END QS
      EPS_SCF 1e-7
    &END SCF
      &XC_FUNCTIONAL PBE0                       ! Choice of functional has a profound influence on the results
        &RI_RPA                                 ! In the RI_RPA and the GW section, additional numerical parameters, e.g.
            SELF_CONSISTENCY      G0W0          ! can be changed to EV_GW0 or EV_GW
              TDA                 TDA+ABBA      ! Diagonalizing ABBA and A
              SPIN_CONFIG         SINGLET       ! or TRIPLET
              NUM_PRINT_EXC       20            ! Number of printed excitations
              ENERGY_CUTOFF_OCC   -1            ! Set to positive numbers (eV) to
              ENERGY_CUTOFF_EMPTY -1            ! truncate matrices A_ia,jb and B_ia,jb
            &END BSE
          &END GW
        &END RI_RPA
    &END XC
      ABC 20 20 20
      H 0.0000 0.0000 0.0000                    ! H2 molecule geometry from GW100 Paper
      H 0.0000 0.0000 0.74144
    &KIND H
      BASIS_SET ORB    aug-cc-pVDZ              ! For production runs, the basis set should be checked for convergence.
      BASIS_SET RI_AUX aug-cc-pVDZ-RIFIT        ! In general, pVDZ should be a solid choice.
    &KIND O
      BASIS_SET ORB    aug-cc-pVDZ

The basis sets aug-cc-pVDZ and aug-cc-pVDZ-RIFIT in BASIS-aug can be obtained from the Basis Set Exchange Library: aug-cc-pVDZ, aug-cc-pVDZ-RIFIT. The geometry for \(\mathrm{H}_2\) was taken from vanSetten2015.

3.2 Output

The BSE calculation outputs the excitation energies \(\Omega^{(n)}\) for n = 1, …, \(N_\text{exc}^\text{print}\):

 BSE| Excitation energies from solving the BSE without the TDA:
 BSE|     Excitation n        Multiplet  TDA/full BSE   Excitation energy Ω (eV)
 BSE|                1    Singlet State        -full-                    11.8621
 BSE|                2    Singlet State        -full-                    12.8264
 BSE|                3    Singlet State        -full-                    15.6415
 BSE|                4    Singlet State        -full-                    15.6415

Afterwards, the single-particle transitions, i.e. the eigenvector elements \(X_{ia}^{(n)}\), with \(|X_{ia}^{(n)}|\) > EPS_X, are printed:

 BSE| Excitations are built up by the following single-particle transitions,
 BSE| neglecting contributions where |X_ia^n| <  0.10 :
 BSE|         -- Quick reminder: HOMO i =    1 and LUMO a =    2 --
 BSE| Excitation n      i =>     a               TDA/full BSE           |X_ia^n|
 BSE|            1      1 =>     2                     -full-             0.6630
 BSE|            1      1 =>     4                     -full-             0.2549
 BSE|            2      1 =>     3                     -full-             0.7059
 BSE|            3      1 =>     6                     -full-             0.7076
 BSE|            4      1 =>     5                     -full-             0.7076

In the case of the \(\mathrm{H}_2\), the lowest excitation n = 1 is mainly built up by a transition from the HOMO (i=1) to the LUMO (a=2), what is apparent from \(X_{i=\text{HOMO},a=\text{LUMO}}^{(n=1)}=0.663\), and also contains a considerable contribution from the 1=>4 (HOMO=>LUMO+2) transition (\(X_{i=\text{HOMO},a=\text{LUMO+2}}^{(n=1)}=0.2549\) ).

3.3 Large scale calculations

Going to larger systems is a challenge for a \(GW\)+BSE-calculation, since the memory consumption increases with \(N_\mathrm{occ}^2 N_\mathrm{empty}^2\). The used \(N_\mathrm{occ}\), \(N_\mathrm{empty}\) and the required memory of a calculation are printed in the output file to help you estimating the memory consumption and distributing the calculation on several nodes to provide the memory. In the following, we provide a sample output of a BSE calculation on a nanographene with 206 atoms:

 BSE| Cutoff occupied orbitals [eV]                                       80.000
 BSE| Cutoff empty orbitals [eV]                                          10.000
 BSE| First occupied index                                                   155
 BSE| Last empty index (not MO index!)                                       517
 BSE| Energy of first occupied index [eV]                                -24.774
 BSE| Energy of last empty index [eV]                                      7.863
 BSE| Energy difference of first occupied index to HOMO [eV]              19.487
 BSE| Energy difference of last empty index to LUMO [eV]                   9.639
 BSE| Number of GW-corrected occupied MOs                                    400
 BSE| Number of GW-corrected empty MOs                                       600
 BSE| Total peak memory estimate from BSE [GB]                          2221.591
 BSE| Peak memory estimate per MPI rank from BSE [GB]                      4.628

To enable BSE calculations on large molecules, we recommend to use a large supercomputer and setting the keywords ENERGY_CUTOFF_OCC and ENERGY_CUTOFF_EMPTY, see details given above.