Sigmund Lukas M, S Shree Sowndarya, Albers Andreas, Erdmann Philipp, Paton Robert S, Greb Lutz
Anorganisch-Chemisches Institut, Ruprecht-Karls-Universität Heidelberg, Im Neuenheimer Feld 270, 69120, Heidelberg, Germany.
Department of Chemistry, Colorado State University, 1301 Center Avenue, Fort Collins, CO, 80523, USA.
Angew Chem Int Ed Engl. 2024 Apr 22;63(17):e202401084. doi: 10.1002/anie.202401084. Epub 2024 Mar 19.
"How strong is this Lewis acid?" is a question researchers often approach by calculating its fluoride ion affinity (FIA) with quantum chemistry. Here, we present FIA49k, an extensive FIA dataset with 48,986 data points calculated at the RI-DSD-BLYP-D3(BJ)/def2-QZVPP//PBEh-3c level of theory, including 13 different p-block atoms as the fluoride-accepting site. The FIA49k dataset was used to train FIA-GNN, two message-passing graph neural networks that predict gas-phase and solution-phase FIA values of molecules excluded from training with a mean absolute error of 14 kJ mol⁻¹ (r = 0.93), using the SMILES string of the Lewis acid as the only input. This level of accuracy is notable given the wide energetic range of 750 kJ mol⁻¹ spanned by FIA49k. The model's value was demonstrated with four case studies, including predictions for molecules extracted from the Cambridge Structural Database and the reproduction of results from catalysis research available in the literature. Weaknesses of the model are evaluated and interpreted chemically. FIA-GNN and the FIA49k dataset are accessible via a free web app (www.grebgroup.de/fia-gnn).
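The two figures of merit quoted above, the mean absolute error and the Pearson correlation coefficient r, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the four FIA values are hypothetical and chosen only to show how such metrics are computed for reference versus predicted values in kJ mol⁻¹.

```python
import math


def mean_absolute_error(reference, predicted):
    """Average absolute deviation between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical FIA values in kJ/mol, invented for illustration only;
# they are NOT taken from FIA49k or from FIA-GNN predictions.
reference_fia = [300.0, 420.0, 510.0, 350.0]
predicted_fia = [310.0, 400.0, 520.0, 340.0]

mae = mean_absolute_error(reference_fia, predicted_fia)
r = pearson_r(reference_fia, predicted_fia)
print(f"MAE = {mae:.1f} kJ/mol, r = {r:.3f}")
```

In practice the reference list would hold the quantum-chemically computed FIA values of held-out molecules and the predicted list the corresponding model outputs.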