Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russia.
Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia.
Expert Opin Drug Discov. 2020 Jul;15(7):755-764. doi: 10.1080/17460441.2020.1745183. Epub 2020 Mar 31.
Deep discriminative and generative neural-network models are becoming an integral part of the modern approach to ligand-based discovery of novel drugs. Choosing the most suitable approach requires expert knowledge, given the variety of neural-network architectures, training methods, and procedures for generating new molecules.
This paper considers three approaches to using deep learning in ligand-based drug discovery: virtual screening, neural generative models, and mutation-based structure generation. It covers several neural-network architectures for building discriminative or generative models, including deep multilayer neural networks, different kinds of convolutional neural networks, recurrent neural networks, and several types of autoencoders. It also covers several learning frameworks, including adversarial learning and reinforcement learning, and different representations for generating molecules, including SMILES, graphs, and several alternative string representations.
Two problems need to be solved for models built with deep neural networks, especially generative models, to become a valuable option in ligand-based drug discovery: the interpretability and explainability of deep-learning models, and the synthetic accessibility of novel compounds designed by deep-learning algorithms.