Worakul Thanapat, Laplaza Rubén, Blaskovits J Terence, Corminboeuf Clémence
Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland.
National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland.
Chem Sci. 2025 Aug 25. doi: 10.1039/d5sc03184b.
Recently, we leveraged the FORMED repository of 116 687 synthesizable molecules to deploy fragment-based high-throughput virtual screening (HTVS) and genetic algorithm (GA) searches for singlet fission (SF) molecular candidates. These approaches identified both prototypical (e.g., acenes, boron-dipyrromethane (BODIPY)) and unreported (e.g., heteroatom-rich mesoionic) classes of chromophore candidates fulfilling specific SF energetic requirements. Yet the reliance on predefined fragments limits chemical space exploration and, thus, the discovery of truly unforeseen molecular cores. Here, we exploit FORMED to train a generative learning framework driven by reinforcement learning and property predictions. The generative model rediscovers a diverse range of previously reported SF chromophore classes, including polyenes, benzofurans, fulvenoids and quinoidal systems, but also suggests an unexpected scaffold absent from the training data, neocoumarin (2-benzopyran-3-one), characterized by two endocyclic double bonds in an arrangement and capped by a lactone group. An in-depth investigation reveals diradicaloid behavior over the conjugated core comparable to that of 2-benzofuran, a widely known SF compound. This work highlights the potential of combining generative and property prediction models to discover candidates beyond derivatives of known chemistry for tailored material applications.
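The abstract refers to "specific SF energetic requirements" without stating them. As a minimal sketch, the block below encodes the two thermodynamic conditions commonly used when screening SF chromophores: E(S1) ≥ 2E(T1), so that fission is not uphill, and E(T2) > 2E(T1), so that triplet pairs do not recombine into T2. Whether the authors applied exactly these conditions, and with what thresholds, is an assumption. The class name, function names, tolerance parameter and energies are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExcitedStates:
    """Excitation energies in eV (hypothetical container, not from the paper)."""
    s1: float  # lowest singlet excited state
    t1: float  # lowest triplet state
    t2: float  # second triplet state


def sf_driving_force(states: ExcitedStates) -> float:
    """Return E(S1) - 2*E(T1); a non-negative value means fission is not uphill."""
    return states.s1 - 2.0 * states.t1


def meets_sf_energetics(states: ExcitedStates, tolerance: float = 0.0) -> bool:
    """Check E(S1) >= 2*E(T1) - tolerance and E(T2) > 2*E(T1).

    `tolerance` (eV) is an assumed knob for admitting slightly endoergic
    candidates; it is not a value taken from the source.
    """
    not_uphill = sf_driving_force(states) >= -tolerance
    no_t2_loss = states.t2 > 2.0 * states.t1
    return not_uphill and no_t2_loss


# Made-up energies for illustration only; these are not results from the paper.
candidate = ExcitedStates(s1=2.4, t1=1.1, t2=2.6)
print(meets_sf_energetics(candidate))
```

In a generative workflow of the kind described, a check like this would be applied to predicted rather than computed energies, for instance as part of the reward that steers the reinforcement learning step.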