Altae-Tran Han, Ramsundar Bharath, Pappu Aneesh S, Pande Vijay
Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States.
Department of Computer Science and Department of Chemistry, Stanford University, Stanford, California 94305, United States.
ACS Cent Sci. 2017 Apr 26;3(4):283-293. doi: 10.1021/acscentsci.6b00367. Epub 2017 Apr 3.
Recent advances in machine learning have made significant contributions to drug discovery. Deep neural networks in particular have been demonstrated to provide significant boosts in predictive power when inferring the properties and activities of small-molecule compounds (Ma, J. et al. J. Chem. Inf. Model. 2015, 55, 263-274). However, the applicability of these techniques has been limited by the requirement for large amounts of training data. In this work, we demonstrate how one-shot learning can be used to significantly lower the amount of data required to make meaningful predictions in drug discovery applications. We introduce a new architecture, the iterative refinement long short-term memory, that, when combined with graph convolutional neural networks, significantly improves learning of meaningful distance metrics over small molecules. We open source all models introduced in this work as part of DeepChem, an open-source framework for deep learning in drug discovery (Ramsundar, B. deepchem.io. https://github.com/deepchem/deepchem, 2016).
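To illustrate how a learned distance metric can support predictions from only a few labeled compounds, the following is a minimal sketch, not the iterative refinement long short-term memory or any DeepChem API. It assumes molecule embeddings have already been computed (for example by a graph convolutional network) and scores a query compound as a similarity-weighted average of the labels in a small support set. The function names, the toy two-dimensional embeddings, and the choice of cosine similarity with a softmax weighting are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def one_shot_predict(query, support_embeddings, support_labels):
    """Score a query as a softmax-weighted average of support labels.

    Hypothetical helper: the weights come from the cosine similarity
    between the query embedding and each support embedding.
    """
    sims = [cosine(query, s) for s in support_embeddings]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * y for w, y in zip(weights, support_labels))


# Toy support set: one active (label 1) and one inactive (label 0) compound.
# These embeddings are made up; real ones would come from a trained model.
support = [[1.0, 0.0], [0.0, 1.0]]
labels = [1, 0]

# A query embedding that lies close to the active compound.
score = one_shot_predict([0.9, 0.1], support, labels)
print(f"predicted activity score: {score:.3f}")
```

In this sketch the quality of the prediction depends entirely on the embedding: if similar compounds are mapped close together, a handful of labeled examples can be enough to rank a new compound, which is the motivation for learning the metric rather than fixing it in advance.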