Centre of Biological Engineering, Campus Gualtar, University of Minho, 4710-057 Braga, Portugal.
J Chem Inf Model. 2021 Nov 22;61(11):5343-5361. doi: 10.1021/acs.jcim.0c01496. Epub 2021 Oct 26.
In the past few years, molecular design has increasingly used generative models from the emerging field of deep learning to propose novel compounds that are likely to possess desired properties or activities. Molecular design finds applications in fields ranging from drug discovery and materials science to biotechnology. A panoply of deep generative models, including architectures such as Recurrent Neural Networks, Autoencoders, and Generative Adversarial Networks, can be trained on existing data sets and used to generate novel compounds. Typically, the new compounds follow the same underlying statistical distributions of properties exhibited by the training data set. Additionally, different optimization strategies, including transfer learning, Bayesian optimization, reinforcement learning, and conditional generation, can direct the generation process toward desired aims regarding biological activities, synthesis processes, or chemical features. Given the recent emergence of these technologies and their relevance, this work presents a systematic and critical review of deep generative models and related optimization methods for targeted compound design, and of their applications.