Industry-scale application and evaluation of deep learning for drug target prediction.

Author information

Sturm Noé, Mayr Andreas, Le Van Thanh, Chupakhin Vladimir, Ceulemans Hugo, Wegner Joerg, Golib-Dzib Jose-Felipe, Jeliazkova Nina, Vandriessche Yves, Böhm Stanislav, Cima Vojtech, Martinovic Jan, Greene Nigel, Vander Aa Tom, Ashby Thomas J, Hochreiter Sepp, Engkvist Ola, Klambauer Günter, Chen Hongming

Affiliations

Clinical Pharmacology and Safety Science, R&D BioPharmaceuticals, AstraZeneca, Pepparedsleden 1, 43183, Mölndal, Sweden.

LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenberger Str. 69, 4040, Linz, Austria.

Publication information

J Cheminform. 2020 Apr 19;12(1):26. doi: 10.1186/s13321-020-00428-5.

Abstract

Artificial intelligence (AI) is undergoing a revolution thanks to breakthroughs of machine learning algorithms in computer vision, speech recognition, natural language processing and generative modelling. Recent work on publicly available pharmaceutical data showed that AI methods are highly promising for drug target prediction. However, the quality of public data might differ from that of industry data because of different labs reporting measurements, different measurement techniques, fewer samples, and less diverse and specialized assays. As part of a European-funded project (ExCAPE) that brought together expertise from the pharmaceutical industry, machine learning, and high-performance computing, we investigated how well machine learning models obtained from public data can be transferred to internal pharmaceutical industry data. Our results show that machine learning models trained on public data can indeed maintain their predictive power to a large degree when applied to industry data. Moreover, we observed that models derived by deep learning outperformed comparable models trained with other machine learning algorithms when applied to internal pharmaceutical company datasets. To our knowledge, this is the first large-scale study that evaluates the potential of machine learning, and especially deep learning, directly in industry-scale settings and that also investigates the transferability of publicly learned target prediction models to industrial bioactivity prediction pipelines.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6cfc/7169028/65a0df376e20/13321_2020_428_Fig1_HTML.jpg