Porcelli Francesco, Filippone Francesco, Colasante Emanuela, Mattioli Giuseppe
Istituto di Struttura della Materia, Consiglio Nazionale delle Ricerche (ISM-CNR), Strada Provinciale 35d/9, Montelibretti 00010, Italy.
Istituto di Struttura della Materia, Consiglio Nazionale delle Ricerche (ISM-CNR), Via del Fosso del Cavaliere 100, 00133 Rome, Italy.
J Chem Phys. 2025 Jun 28;162(24). doi: 10.1063/5.0272583.
Photoemission measurements in the gas phase at low pressure have enabled the exploration of the intricate relationship between electronic and structural properties at the single-molecule level. Experimental data collected from isolated molecules, free from interactions with other species, provide an ideal testing ground for developing ab initio simulations capable of interpreting and predicting photoemission spectra. In particular, accurate computational methods for determining atom- and site-specific core ionization binding energies (BEs) facilitate the interpretation of experimental data: they allow contributions from non-equivalent atoms of the same species to be assigned even when the corresponding spectral features remain unresolved because of the molecular structure. In this context, we have developed, extensively tested, and made publicly available a computational protocol based on plane wave/pseudopotential density functional theory (PW-DFT) within a ΔSCF framework to predict x-ray photoemission spectra (XPS) of isolated molecules. Moreover, we have preliminarily tested and demonstrated the applicability of the same method to large molecular aggregates and thin molecular films deposited on inorganic substrates. The protocol has been assessed using a representative set of semilocal and hybrid density functionals with increasing fractions of Hartree-Fock exact exchange (EXX): PBE (semilocal, no EXX), B3LYP (20% EXX), HSE (range-separated, with 25% EXX at short range), and BH&HLYP (50% EXX). As a benchmark, we have also employed the equation-of-motion coupled-cluster method with single and double excitations. Our protocol has been validated across a diverse range of molecular classes, including aromatic, heteroaromatic, and aliphatic compounds, drugs, and biomolecules, and it shows high accuracy and robustness even when semilocal DFT is used. In addition, valence photoemission measurements complement core photoemission by providing insights into delocalized and π-conjugated molecular orbitals. These measurements are particularly useful for studying chemical modifications in large molecules mediated by non-covalent interactions. Using the same set of density functionals, we have evaluated their capability to predict valence-shell ionization spectra, employing Kohn-Sham eigenvalues as estimators. Finally, our PW-DFT dataset of C1s, N1s, and O1s BEs has been used to train machine learning (ML) models for predicting XPS spectra of isolated organic molecules based on their structure. To ensure reproducibility and encourage the adoption of our protocol, we have made available a public repository containing pseudopotentials, input files for ab initio calculations, and datasets used for ML model training.
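As a rough illustration of the two estimators named in the abstract, the following minimal Python sketch computes a ΔSCF binding energy as the difference between the total energy of the core-ionized state and that of the ground state, and turns a list of discrete energies (core BEs, or negated Kohn-Sham eigenvalues for the valence region) into a broadened spectrum. It is not the authors' implementation. The function names, the assumption that total energies are given in Rydberg, the optional `shift_ev` offset (a stand-in for whatever alignment or pseudopotential correction an actual protocol applies), the Gaussian line shape, and all numerical values are hypothetical placeholders chosen for illustration only.

```python
import math

RY_TO_EV = 13.605693  # 1 Rydberg in eV (assumes energies are supplied in Ry)


def delta_scf_binding_energy(e_ground, e_core_hole, unit_to_ev=RY_TO_EV, shift_ev=0.0):
    """Delta-SCF estimate: BE = E(core-ionized) - E(ground state), in eV.

    shift_ev is a hypothetical constant offset for any alignment or
    correction term; it defaults to zero.
    """
    return (e_core_hole - e_ground) * unit_to_ev + shift_ev


def ks_eigenvalues_to_binding_energies(eigenvalues_ev):
    """Use negated occupied Kohn-Sham eigenvalues as valence BE estimators."""
    return [-eps for eps in eigenvalues_ev]


def broadened_spectrum(energies_ev, grid_ev, fwhm_ev=0.5, weights=None):
    """Sum of unit-area Gaussians centred on each energy, sampled on grid_ev."""
    if weights is None:
        weights = [1.0] * len(energies_ev)
    sigma = fwhm_ev / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [
        sum(w * norm * math.exp(-0.5 * ((x - e) / sigma) ** 2)
            for e, w in zip(energies_ev, weights))
        for x in grid_ev
    ]


if __name__ == "__main__":
    # Placeholder total energies (Ry) for two non-equivalent C atoms;
    # these are made-up numbers, not results from the paper.
    e_ground = -100.0
    core_hole_energies = [-78.60, -78.55]
    c1s_bes = [delta_scf_binding_energy(e_ground, e) for e in core_hole_energies]

    # Energy grid spanning 2 eV on either side of the computed BEs.
    start = min(c1s_bes) - 2.0
    n_points = int((max(c1s_bes) - min(c1s_bes) + 4.0) / 0.01)
    grid = [start + 0.01 * i for i in range(n_points)]
    spectrum = broadened_spectrum(c1s_bes, grid, fwhm_ev=0.5)

    peak_energy = grid[spectrum.index(max(spectrum))]
    print(f"C1s BEs (eV): {[round(be, 2) for be in c1s_bes]}")
    print(f"Broadened spectrum maximum near {peak_energy:.2f} eV")
```

The same `broadened_spectrum` helper can be applied to the output of `ks_eigenvalues_to_binding_energies` to sketch a valence-shell spectrum; the choice of line shape and width would in practice be matched to the experimental resolution.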