Cho Yuri, Laplaza Ruben, Vela Sergi, Corminboeuf Clémence
Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Digit Discov. 2024 Jul 12;3(8):1638-1647. doi: 10.1039/d4dd00093e. eCollection 2024 Aug 7.
Exploiting crystallographic data repositories for large-scale quantum chemical computations requires the rapid and accurate extraction of the molecular structure, charge and spin from the crystallographic information file. Here, we develop a general approach to assign the ground state spin of transition metal complexes, in complement to our previous efforts on determining metal oxidation states and bond order within the software. Starting from a database of 31k transition metal complexes extracted from the Cambridge Structural Database with the same software, we construct the TM-GSspin dataset, which contains 2063 mononuclear first row transition metal complexes and their computed ground state spins. TM-GSspin is highly diverse in terms of metals, metal oxidation states, coordination geometries, and coordination sphere compositions. Based on TM-GSspin, we identify correlations between structural and electronic features of the complexes and their ground state spins to develop a rule-based spin state assignment model. Leveraging this knowledge, we construct interpretable descriptors and build a statistical model achieving 98% cross-validated accuracy in predicting the ground state spin across the board. Our approach provides a practical way to determine the ground state spin of transition metal complexes directly from crystal structures without additional computations, thus enabling the automated use of crystallographic data for large-scale computations involving transition metal complexes.
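To make the assignment problem concrete, the sketch below enumerates the spin multiplicities that are formally possible for a mononuclear first-row transition metal centre, given its metal and oxidation state. It is not the rule-based or statistical model described in the abstract; it only illustrates the set of candidates from which a ground state has to be chosen. The function names are hypothetical, and the sketch assumes all unpaired electrons sit in the metal d shell (no ligand-centred radicals).

```python
# Periodic-table group numbers of the first-row (3d) transition metals.
GROUP = {"Sc": 3, "Ti": 4, "V": 5, "Cr": 6, "Mn": 7,
         "Fe": 8, "Co": 9, "Ni": 10, "Cu": 11, "Zn": 12}


def d_electron_count(metal: str, oxidation_state: int) -> int:
    """Formal d-electron count: group number minus oxidation state."""
    n = GROUP[metal] - oxidation_state
    if not 0 <= n <= 10:
        raise ValueError(f"{metal}({oxidation_state}) gives an invalid d count: {n}")
    return n


def candidate_multiplicities(metal: str, oxidation_state: int) -> list[int]:
    """Formally possible spin multiplicities (2S + 1) for a d^n centre.

    Assumption: unpaired electrons are metal-centred only.
    """
    n = d_electron_count(metal, oxidation_state)
    # At most five d orbitals can be singly occupied.
    max_unpaired = n if n <= 5 else 10 - n
    # The number of unpaired electrons keeps the parity of n.
    return [u + 1 for u in range(max_unpaired % 2, max_unpaired + 1, 2)]


if __name__ == "__main__":
    for metal, ox in [("Fe", 2), ("Fe", 3), ("Cu", 2)]:
        print(metal, ox, candidate_multiplicities(metal, ox))
```

For Fe(II), a d6 centre, this returns singlet, triplet and quintet as candidates; deciding which of them is the ground state is the task the abstract addresses.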