CSIRO Manufacturing, Clayton, 3168, Australia.
Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, 3052, Australia.
Mol Inform. 2017 Jan;36(1-2). doi: 10.1002/minf.201600118. Epub 2016 Oct 26.
Neural networks have generated valuable Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models for a wide variety of small-molecule and materials properties. They have grown in sophistication, and many of their initial problems have been overcome by modern mathematical techniques. QSAR studies have almost always used so-called "shallow" neural networks, in which there is a single hidden layer between the input and output layers. Recently, a new and potentially paradigm-shifting type of neural network based on deep learning has appeared. Deep learning methods have generated impressive improvements in image and voice recognition, and are now being applied to QSAR and QSPR modelling. This paper describes the differences in approach between deep and shallow neural networks, compares their abilities to predict the properties of test sets for 15 large drug data sets (the Kaggle set), discusses the results in terms of the Universal Approximation theorem for neural networks, and describes how deep neural networks (DNNs) may ameliorate or remove troublesome "activity cliffs" in QSAR data sets.