Science Data Software, LLC, 14914 Bradwill Court, Rockville, Maryland 20850, United States.
Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States.
Mol Pharm. 2017 Dec 4;14(12):4462-4475. doi: 10.1021/acs.molpharmaceut.7b00578. Epub 2017 Nov 13.
Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint-type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning approach attracting attention for many pharmaceutical applications, from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction in many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, and physicochemical properties, as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when evaluated with an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient, and others. Based on ranked normalized scores for the metrics or data sets, deep neural networks (DNN) ranked higher than support vector machines (SVM), which in turn ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar-type plots indicates when models are inferior or perhaps overtrained. These results also suggest the need for assessing deep learning further using multiple metrics with much larger scale comparisons, prospective testing, and assessment of different fingerprints and DNN architectures beyond those used.
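For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how AUC, F1 score, Cohen's kappa, and the Matthews correlation coefficient can be computed for a binary classifier. It uses the standard textbook definitions and is not taken from the study's own code. The function names, the 0.5 decision threshold, and the eight-compound toy data are illustrative assumptions only.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (active) and 0 (inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; tn is unused but kept for a uniform signature."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp, tn, fp, fn):
    """Observed agreement corrected for the agreement expected by chance."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Correlation between predicted and true labels, in the range [-1, 1]."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random active outscores a random inactive (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
print("AUC  :", roc_auc(y_true, scores))     # 0.9375
print("F1   :", f1_score(*counts))           # 0.75
print("kappa:", cohen_kappa(*counts))        # 0.5
print("MCC  :", matthews_corrcoef(*counts))  # 0.5
```

AUC is computed from the continuous scores, while the other three depend on the chosen threshold. This is one reason a comparison across several metrics can rank models differently than any single metric would.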