Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
Acc Chem Res. 2021 Feb 2;54(3):532-545. doi: 10.1021/acs.accounts.0c00686. Epub 2021 Jan 22.
The variability of chemical bonding in open-shell transition-metal complexes not only motivates their study as functional materials and catalysts but also challenges conventional computational modeling tools. Here, tailoring ligand chemistry can alter preferred spin or oxidation states as well as electronic structure properties and reactivity, creating vast regions of chemical space to explore when designing new materials atom by atom. Although first-principles density functional theory (DFT) remains the workhorse of computational chemistry in mechanism deduction and property prediction, it is of limited use here. DFT is both far too computationally costly for widespread exploration of transition-metal chemical space and also prone to inaccuracies that limit its predictive performance for localized d electrons in transition-metal complexes. These challenges starkly contrast with the well-trodden regions of small-organic-molecule chemical space, where the analytical forms of molecular mechanics force fields and semiempirical theories have for decades accelerated the discovery of new molecules, accurate DFT functional performance has been demonstrated, and gold-standard methods from correlated wavefunction theory can predict experimental results to chemical accuracy. The combined promise of transition-metal chemical space exploration and lack of established tools has mandated a distinct approach. In this Account, we outline the path we charted in exploration of transition-metal chemical space starting from the first machine learning (ML) models (i.e., artificial neural network and kernel ridge regression) and representations for the prediction of open-shell transition-metal complex properties.
The distinct importance of the immediate coordination environment of the metal center as well as the lack of low-level methods to accurately predict structural properties in this coordination environment first motivated and then benefited from these ML models and representations. Once developed, the recipe for prediction of geometric, spin state, and redox potential properties was straightforwardly extended to a diverse range of other properties, including in catalysis, computational "feasibility", and the gas separation properties of periodic metal-organic frameworks. Interpretation of selected features most important for model prediction revealed new ways to encapsulate design rules and confirmed that models were robustly mapping essential structure-property relationships. Encountering the special challenge of ensuring that good model performance could generalize to new discovery targets motivated investigation of how to best carry out model uncertainty quantification. Distance-based approaches, whether in model latent space or in carefully engineered feature space, provided intuitive measures of the domain of applicability. With all of these pieces together, ML can be harnessed as an engine to tackle the large-scale exploration of transition-metal chemical space needed to satisfy multiple objectives using efficient global optimization methods. In practical terms, bringing these artificial intelligence tools to bear on the problems of transition-metal chemical space exploration has resulted in ML-model assessments of large, multimillion compound spaces in minutes and validated new design leads in weeks instead of decades.