Medrano Sandonas Leonardo, Hoja Johannes, Ernst Brian G, Vázquez-Mayagoitia Álvaro, DiStasio Robert A, Tkatchenko Alexandre
Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg.
Institute of Chemistry, University of Graz, 8010 Graz, Austria.
Chem Sci. 2023 Aug 18;14(39):10702-10717. doi: 10.1039/d3sc03598k. eCollection 2023 Oct 11.
The rational design of molecules with targeted quantum-mechanical (QM) properties requires an advanced understanding of the structure-property/property-property relationships (SPR/PPR) that exist across chemical compound space (CCS). In this work, we analyze these fundamental relationships in the sector of CCS spanned by small (primarily organic) molecules using the recently developed QM7-X dataset, a systematic, extensive, and tightly converged collection of 42 QM properties corresponding to ≈4.2M equilibrium and non-equilibrium molecular structures containing up to seven heavy (non-hydrogen) atoms (including C, N, O, S, and Cl). By characterizing and enumerating progressively more complex manifolds of molecular property space (the high-dimensional space defined by the properties of each molecule in this sector of CCS), we show that one has a substantial degree of flexibility, or "freedom of design", when searching for a single molecule with a desired pair of properties or for a set of distinct molecules sharing an array of properties. To explore how this intrinsic flexibility manifests in the molecular design process, we used multi-objective optimization to search for molecules with simultaneously large polarizabilities and HOMO-LUMO gaps; analysis of the resulting Pareto fronts identified non-trivial paths through CCS consisting of sequential structural and/or compositional changes that yield molecules with optimal combinations of these properties.
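To make the notion of a Pareto front concrete, the following is a minimal sketch of a non-dominated filter for two properties that are both maximized (here, polarizability and HOMO-LUMO gap). It is not the optimization procedure used in the paper, which the abstract does not specify. The function name `pareto_front` and the numerical values in `candidates` are hypothetical and chosen only for illustration; they are not taken from QM7-X.

```python
from typing import List, Sequence, Tuple


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated points when maximizing both objectives.

    A point is dominated if another point is at least as good in both
    objectives and strictly better in at least one. Exact duplicates of a
    front member are reported only once (the first one in sort order).
    """
    # Visit points from largest to smallest first objective; ties are
    # ordered by the second objective, also from largest to smallest.
    order = sorted(range(len(points)), key=lambda i: (-points[i][0], -points[i][1]))

    front: List[int] = []
    best_second = float("-inf")
    for i in order:
        # Every point visited so far has a first objective >= this one, so
        # this point survives only if it strictly improves the second.
        if points[i][1] > best_second:
            front.append(i)
            best_second = points[i][1]
    return front


# Hypothetical (polarizability, HOMO-LUMO gap) pairs, illustration only.
candidates = [
    (60.0, 5.0),  # index 0
    (80.0, 4.0),  # index 1
    (70.0, 3.5),  # index 2: dominated by index 1
    (50.0, 7.0),  # index 3
    (80.0, 3.0),  # index 4: dominated by index 1
]

front = pareto_front(candidates)
```

With these illustrative values, `front` holds the indices of the three candidates that trade polarizability against gap without being beaten on both at once. A full multi-objective search would generate or select candidate molecules iteratively and apply a filter of this kind at each step.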