Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States.
Acc Chem Res. 2022 Jul 19;55(14):1972-1984. doi: 10.1021/acs.accounts.2c00288. Epub 2022 Jul 7.
Photochemical reactions are of great importance in chemistry, biology, and materials science because they take advantage of a renewable energy source, mild reaction conditions, and high atom economy. Light absorption can excite molecules to a higher-energy electronic state of the same spin multiplicity. The ensuing nonadiabatic processes induce molecular transformations that afford exotic molecular architectures and high-energy isomers that are inaccessible by thermal means. Computational simulations now complement time-resolved instrumentation to reveal ultrafast excited-state mechanistic information for photochemical reactions, which is essential for disentangling elusive spectroscopic features, excited-state lifetimes, and excited-state mechanistic critical points. Nonadiabatic molecular dynamics (NAMD), powered by surface hopping techniques, is among the most widely applied methods for modeling the photochemical reactions of medium-sized molecules. However, its computational efficiency is limited because each of the hundreds of trajectories requires thousands of multiconfigurational quantum-chemical calculations. Machine learning (ML) has emerged as a revolutionary force in computational chemistry, predicting the outcome of these resource-intensive multiconfigurational calculations on the fly. An ML potential trained with a substantial set of quantum-chemical calculations can predict energies and forces with errors below chemical accuracy at negligible cost. The integration of ML potentials in NAMD dramatically extends the maximum simulation time scale by ∼10,000-fold to the nanosecond regime.

In this Account, we present a comprehensive demonstration of ML photodynamics simulations and summarize our most recent applications in resolving complex photochemical reactions. First, we address three fundamental components of ML techniques for photodynamics simulations: the quantum-chemical data set, the ML potential, and NAMD.
Second, we describe best practices in building training data and our procedure for training the ML photodynamics model, drawing on our recent literature contributions. We introduce a convenient training data generation scheme that combines Wigner sampling and geometrical interpolation. This scheme yields reliable and effective ML potentials suitable for subsequent active learning, which detects undersampled data. We demonstrate how active learning automatically discovers new mechanistic pathways and reproduces experimental results. We point out that atomic permutation is an essential data augmentation approach for improving the learnability of distance-based molecular descriptors for highly symmetric molecules. Third, we demonstrate the utility of ML photodynamics with simulation results for (1) the photo-torquoselective 4π disrotatory electrocyclic ring closing of norbornyl cyclohexadiene, which reveals a thermal conversion from experimentally unobserved intermediates to the reactant within 1 ns; (2) the [2 + 2] photocycloaddition of substituted [3]-ladderdienes in competition with 4π and 6π electrocyclic ring-opening reactions, which uncovers substituent effects that explain the reported increased quantum yield of substituted cubane precursors; and (3) the photochemical 4π disrotatory electrocyclic reactions of fluorobenzenes over nanoseconds with XMS-CASPT2-level training data. We expect this Account to broaden the understanding of ML photodynamics and to inspire future developments and applications to increasingly large molecules within complex environments on long time scales.
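The atomic-permutation augmentation mentioned above can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes an inverse-distance descriptor as the distance-based representation, and the function names (`inverse_distance_descriptor`, `permutation_augment`) and the `equivalent_groups` argument are hypothetical names introduced only for this example. The idea is that relabeling symmetry-equivalent atoms leaves the energy unchanged but reorders the descriptor entries, so adding the permuted copies to the training set lets the model see every equivalent labeling.

```python
import itertools
import numpy as np

def inverse_distance_descriptor(coords):
    """Upper-triangle inverse interatomic distances (an assumed descriptor)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(len(coords), k=1)
    return 1.0 / dist[i, j]

def permutation_augment(coords, forces, equivalent_groups):
    """Return every copy of (coords, forces) with equivalent atoms relabeled.

    equivalent_groups is a list of index tuples; atoms within a tuple are
    treated as interchangeable (hypothetical, user-supplied input).
    The energy label is unchanged by relabeling, so it is not handled here.
    """
    n_atoms = len(coords)
    per_group = [list(itertools.permutations(g)) for g in equivalent_groups]
    augmented = []
    for combo in itertools.product(*per_group):
        order = np.arange(n_atoms)
        for group, perm in zip(equivalent_groups, combo):
            order[list(group)] = perm  # place permuted atoms in the group's slots
        # Forces are per-atom vectors, so they are reordered with the atoms.
        augmented.append((coords[order], forces[order]))
    return augmented

# Usage: a distorted water-like geometry whose two H atoms (indices 1, 2)
# are treated as equivalent; coordinates and forces are made-up numbers.
coords = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [0.0, 1.10, 0.0]])
forces = np.array([[0.1, 0.2, 0.0], [-0.3, 0.0, 0.0], [0.2, -0.2, 0.0]])
samples = permutation_augment(coords, forces, [(1, 2)])
descriptors = [inverse_distance_descriptor(c) for c, _ in samples]
```

In this sketch the two samples describe the same physical structure, yet their descriptor vectors differ in ordering, which is the ambiguity the augmentation is meant to cover. The number of copies grows factorially with group size, so for large equivalent groups a random subset of permutations would likely be needed instead of the full enumeration.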