Azzouzi Mohammed, Bennett Steven, Posligua Victor, Bondesan Roberto, Zwijnenburg Martijn A, Jelfs Kim E
Department of Chemistry, Imperial College London, White City Campus, W12 0BZ London, UK
Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
Digit Discov. 2025 Aug 13. doi: 10.1039/d4dd00355a.
Identifying organic molecules with desirable properties from the extensive chemical space can be challenging, particularly when property evaluation methods are time-consuming and resource-intensive. In this study, we illustrate this challenge by exploring the chemical space of large oligomers, constructed from monomeric building blocks, for potential use in organic photovoltaics (OPV). For this purpose, we developed a Python package (GitHub link: STK_search) that searches the chemical space using a building block approach. We use this package to compare a variety of search algorithms, including those based on Bayesian optimisation and evolutionary approaches. Initially, we evaluated and compared the performance of different search algorithms within a precomputed search space. We then extended our investigation to the vast chemical space of molecules formed of 6 building blocks (6-mers), comprising over 10 molecules. Notably, while some algorithms show only marginal improvements over a random search approach in a relatively small, precomputed search space, their performance in the larger chemical space is orders of magnitude better. Specifically, Bayesian optimisation identified a thousand times more promising molecules with the desired properties than random search, using the same computational resources.