Daicel Corporation, Kita-ku, 530-0011 Osaka, Japan.
The Institute of Statistical Mathematics, Research Organization of Information and Systems, Tachikawa, Tokyo 190-8562, Japan.
J Chem Inf Model. 2023 Sep 11;63(17):5539-5548. doi: 10.1021/acs.jcim.3c00329. Epub 2023 Aug 21.
Recent advances in machine learning have led to the rapid adoption of computational methods for de novo molecular design in polymer research, including high-throughput virtual screening and inverse molecular design. In such workflows, molecular generators play an essential role in creating or sequentially modifying candidate polymer structures. Despite great technical progress in machine learning-assisted molecular design over the past few years, the difficulty of identifying synthetic routes to the designed polymers remains unresolved. To address this limitation, we present Small Molecules into Polymers (SMiPoly), a Python library for virtual polymer generation that implements 22 chemical rules for commonly applied polymerization reactions. Given a set of small organic molecules as candidate monomers, the SMiPoly generator applies every possible polymerization reaction to produce an exhaustive list of potentially synthesizable polymers. In this study, using 1083 readily available monomers, we generated 169,347 unique polymers of seven different types: polyolefin, polyester, polyether, polyamide, polyimide, polyurethane, and polyoxazolidone. Comparing the distribution of the virtually created polymers with that of approximately 16,000 real polymers synthesized so far, we found that the coverage and novelty of the SMiPoly-generated polymers can reach 48% and 53%, respectively. Incorporating the SMiPoly library into a molecular design workflow will accelerate de novo polymer synthesis by shortening the step of selecting synthesizable candidate polymers.
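The rule-based enumeration described in the abstract can be illustrated with a minimal, self-contained sketch. It does not use SMiPoly's API and makes no claim about how the library is implemented. The rule table, the hand-assigned functional-group classes, and the function name `enumerate_polymers` are all hypothetical and chosen only for illustration. A real generator would presumably detect functional groups from molecular structures rather than from labels, and would cover many more reaction rules than the four shown here.

```python
# Toy sketch of rule-based polymer enumeration (NOT the SMiPoly API).
# All names, rules, and monomer labels below are illustrative assumptions.

# Each rule: (monomer class A, monomer class B or None, resulting polymer class).
# A rule with class B set to None stands for a single-monomer polymerization.
RULES = [
    ("olefin", None, "polyolefin"),
    ("diol", "dicarboxylic_acid", "polyester"),
    ("diamine", "dicarboxylic_acid", "polyamide"),
    ("diol", "diisocyanate", "polyurethane"),
]

# Toy monomer set: name -> functional-group class, assigned by hand here.
MONOMERS = {
    "ethylene": "olefin",
    "styrene": "olefin",
    "ethylene glycol": "diol",
    "1,4-butanediol": "diol",
    "adipic acid": "dicarboxylic_acid",
    "terephthalic acid": "dicarboxylic_acid",
    "hexamethylenediamine": "diamine",
    "MDI": "diisocyanate",
}


def enumerate_polymers(monomers, rules):
    """Return a sorted list of (polymer_class, monomer_names) tuples.

    Every rule is applied to every compatible monomer or monomer pair,
    so the result is exhaustive with respect to the given rule table.
    """
    results = set()
    for class_a, class_b, polymer_class in rules:
        group_a = sorted(n for n, c in monomers.items() if c == class_a)
        if class_b is None:
            # Single-monomer rule: one homopolymer per matching monomer.
            for name in group_a:
                results.add((polymer_class, (name,)))
        else:
            # Two-monomer rule: one polymer per compatible pair.
            group_b = sorted(n for n, c in monomers.items() if c == class_b)
            for a in group_a:
                for b in group_b:
                    results.add((polymer_class, (a, b)))
    return sorted(results)


polymers = enumerate_polymers(MONOMERS, RULES)
```

With this toy input, the list pairs each diol with each dicarboxylic acid, and so on for the other rules. The number of candidates grows multiplicatively with the monomer set, which is consistent with the abstract's report of a large polymer list generated from 1083 monomers.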