Fiocruz, 4365 Avenida Brasil, Rio de Janeiro, RJ 21040 900, Brazil.
IBM Watson, 1101 Kitchawan Rd., Yorktown Heights, New York 10598, United States.
J Chem Inf Model. 2016 Dec 27;56(12):2495-2506. doi: 10.1021/acs.jcim.6b00355. Epub 2016 Nov 29.
In this work, we propose a deep learning approach to improve docking-based virtual screening. The proposed deep neural network, DeepVS, uses the output of a docking program and learns how to extract relevant features from basic data, such as atom and residue types, obtained from protein-ligand complexes. Our approach introduces the use of atom and amino acid embeddings and implements an effective way of creating distributed vector representations of protein-ligand complexes by modeling the compound as a set of atom contexts that is further processed by a convolutional layer. One of the main advantages of the proposed method is that it does not require feature engineering. We evaluate DeepVS on the Directory of Useful Decoys (DUD), using the output of two docking programs: AutoDock Vina 1.1.2 and DOCK 6.6. Using a strict evaluation with leave-one-out cross-validation, DeepVS outperforms both docking programs with regard to AUC ROC and enrichment factor. Moreover, using the output of AutoDock Vina 1.1.2, DeepVS achieves an AUC ROC of 0.81, which, to the best of our knowledge, is the best AUC reported so far for virtual screening using the 40 receptors from the DUD.
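The two evaluation metrics named in the abstract, AUC ROC and enrichment factor, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `roc_auc` and `enrichment_factor`, the toy scores, and the convention that a higher score means "more likely active" are all assumptions made for illustration. Raw docking energies, where lower is better, would need to be negated first.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC ROC as the probability that a random active outscores a random decoy.

    Uses the pairwise (Mann-Whitney) formulation; ties count as half a win.
    Assumes higher score = more likely active, and labels are 1 (active) or 0 (decoy).
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(
    scores: Sequence[float], labels: Sequence[int], fraction: float
) -> float:
    """Enrichment factor at the top `fraction` of the ranked list.

    EF = (actives in top / compounds in top) / (total actives / total compounds).
    """
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("need at least one active")
    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_top = max(1, int(round(fraction * len(ranked))))
    actives_top = sum(y for _, y in ranked[:n_top])
    return (actives_top / n_top) / (total_actives / len(ranked))


# Toy data (hypothetical): 3 actives and 7 decoys for one receptor.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(roc_auc(toy_scores, toy_labels))
print(enrichment_factor(toy_scores, toy_labels, 0.2))
```

In a leave-one-out setting over receptors, metrics like these would be computed on the held-out receptor only and then averaged across receptors. The pairwise AUC loop is quadratic in the number of compounds, which is fine for a sketch but slow for large screening libraries.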