Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France.
Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia.
Sci Rep. 2021 Feb 4;11(1):3178. doi: 10.1038/s41598-021-81889-y.
The "creativity" of Artificial Intelligence (AI) in generating de novo molecular structures has opened a novel paradigm in compound design, notwithstanding weaknesses such as the stability and feasibility issues of the generated structures. Here we show that "creative" AI can likewise be taught to enumerate novel, stoichiometrically coherent chemical reactions. Furthermore, when coupled to reaction space cartography, de novo reaction design can be focused on a desired reaction class. A sequence-to-sequence autoencoder with bidirectional Long Short-Term Memory layers was trained on purpose-built "SMILES/CGR" strings encoding reactions from the USPTO database. The autoencoder latent space was visualized on a generative topographic map. Novel latent space points were sampled around a map area populated by Suzuki reactions and decoded into the corresponding reactions. These can be critically analyzed by an expert, cleaned of irrelevant functional groups and eventually attempted experimentally, thereby broadening the synthetic scope of popular synthetic pathways.
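The step of sampling novel latent space points around a populated map area can be illustrated with a minimal sketch. The abstract does not state how the sampling was performed, so the Gaussian perturbation used here is an assumption, as are the function name `sample_around`, the parameter `sigma`, and the three-dimensional toy vectors standing in for latent codes of known Suzuki reactions. Decoding the sampled points back into reaction strings would require the trained autoencoder and is not shown.

```python
import random
from typing import List, Sequence


def sample_around(
    seeds: Sequence[Sequence[float]],
    n_samples: int,
    sigma: float = 0.1,
    seed: int = 0,
) -> List[List[float]]:
    """Draw new latent points near a set of known latent vectors.

    Assumption: each new point is a randomly chosen seed vector plus
    isotropic Gaussian noise with standard deviation `sigma`. This is an
    illustrative scheme, not the procedure used in the paper.
    """
    if not seeds:
        raise ValueError("at least one seed vector is required")
    rng = random.Random(seed)  # local generator keeps results reproducible
    samples = []
    for _ in range(n_samples):
        center = rng.choice(seeds)  # pick one known reaction as the anchor
        samples.append([x + rng.gauss(0.0, sigma) for x in center])
    return samples


# Hypothetical latent vectors of two known Suzuki reactions (toy values).
suzuki_latents = [[0.2, -1.1, 0.7], [0.3, -0.9, 0.6]]
new_points = sample_around(suzuki_latents, n_samples=5, sigma=0.05, seed=42)
```

A small `sigma` keeps the samples close to the known reactions, which under this assumption would favor decoded reactions of the same class; a larger value explores further at the cost of drifting into other map regions.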