Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India.
Expert Opin Drug Discov. 2022 Oct;17(10):1071-1079. doi: 10.1080/17460441.2022.2134340. Epub 2022 Oct 17.
Deep learning approaches have become popular in de novo drug design in recent years. Generative models for molecule generation and optimization have shown promising results. Models trained on different chemical data can generate molecules that are similar to a query molecule, thus supporting lead optimization. Recurrent neural network-based generative models have been applied in low-data drug discovery, fragment-based drug design, and lead optimization.
In this review, we provide an overview of recurrent neural network models and their variants for molecule generation, with recent examples. The input representations of molecules as SMILES and as molecular graphs are discussed. The evaluation benchmarks and metrics used for generative neural network models are also highlighted. To this end, the ScienceDirect, Web of Science, and Google Scholar databases were searched with the article's keywords and their combinations to retrieve the most relevant and up-to-date information.
The simplicity of SMILES notation makes it suitable for training a sequence-based model such as a recurrent neural network. However, models that can be trained on molecular graphs to generate synthesizable molecular structures could open new possibilities for valid molecule generation and synthetic feasibility.
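To illustrate why SMILES lends itself to sequence models, the following is a minimal sketch, not the method of any model covered in the review. It splits SMILES strings into tokens and builds the shifted input/target index sequences that a recurrent network would use for next-token prediction. The regular expression, the `<bos>`/`<eos>` markers, the function names, and the three example molecules are all assumptions made for illustration.

```python
import re

# Assumed tokenization rule: bracket atoms, two-letter halogens, and
# two-digit ring closures (%NN) are kept whole; everything else is one character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

# Hypothetical sequence markers; chosen so they cannot collide with SMILES symbols.
BOS, EOS = "<bos>", "<eos>"


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Map every token seen in the corpus (plus the markers) to an integer index."""
    tokens = {tok for s in smiles_list for tok in tokenize(s)}
    return {tok: i for i, tok in enumerate([BOS, EOS] + sorted(tokens))}


def next_token_pairs(smiles, vocab):
    """Return (inputs, targets): the targets are the inputs shifted left by one step."""
    ids = [vocab[BOS]] + [vocab[t] for t in tokenize(smiles)] + [vocab[EOS]]
    return ids[:-1], ids[1:]


# Illustrative corpus: ethanol, chlorobenzene, L-alanine.
corpus = ["CCO", "c1ccccc1Cl", "C[C@H](N)C(=O)O"]
vocab = build_vocab(corpus)
inputs, targets = next_token_pairs("CCO", vocab)
```

At each position the network would read `inputs[t]` and be trained to predict `targets[t]`. A graph-based model has no such linear reading order and needs a different input encoding, which is part of what makes SMILES the simpler starting point.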