Meta-QSAR: a large-scale application of meta-learning to drug design and discovery.

Authors

Olier Ivan, Sadawi Noureddin, Bickerton G Richard, Vanschoren Joaquin, Grosan Crina, Soldatova Larisa, King Ross D

Affiliations

1. Manchester Metropolitan University, Manchester, UK.

2. University of Manchester, Manchester, UK.

Publication information

Mach Learn. 2018;107(1):285-311. doi: 10.1007/s10994-017-5685-x. Epub 2017 Dec 22.

Abstract

We investigate the learning of quantitative structure-activity relationships (QSARs) as a case study of meta-learning. This application area is of the highest societal importance, as it is a key step in the development of new medicines. The standard QSAR learning problem is: given a target (usually a protein) and a set of chemical compounds (small molecules) with associated bioactivities (e.g. inhibition of the target), learn a predictive mapping from molecular representation to activity. Although almost every type of machine learning method has been applied to QSAR learning, there is no agreed single best way of learning QSARs, and therefore the problem area is well suited to meta-learning. We first carried out the most comprehensive comparison to date of machine learning methods for QSAR learning: 18 regression methods and 3 molecular representations, applied to more than 2700 QSAR problems. (These results have been made publicly available on OpenML and represent a valuable resource for testing novel meta-learning methods.) We then investigated the utility of algorithm selection for QSAR problems. We found that this meta-learning approach outperformed the best individual QSAR learning method (random forests using a molecular fingerprint representation) by up to 13%, on average. We conclude that meta-learning outperforms base-learning methods for QSAR learning, and as this investigation is one of the most extensive comparisons of base and meta-learning methods ever made, it provides evidence for the general effectiveness of meta-learning over base-learning.
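The algorithm-selection idea in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not say which meta-learner or meta-features were used, so a nearest-neighbour rule is swapped in here for brevity, and every name, meta-feature, and error value below is hypothetical. The sketch recommends, for a new QSAR problem, the base method that performed best on the most similar previously solved problem, and compares that with always using the single method that is best on average.

```python
from math import dist

# Hypothetical meta-dataset: one entry per previously solved QSAR problem.
# "meta" holds two made-up, pre-scaled meta-features describing the dataset;
# "rmse" holds the (made-up) error of each base method on that problem.
PAST_PROBLEMS = [
    {"meta": (0.2, 0.1), "rmse": {"random_forest": 0.70, "svm": 0.82, "ridge": 0.95}},
    {"meta": (0.9, 0.5), "rmse": {"random_forest": 0.88, "svm": 0.80, "ridge": 0.90}},
    {"meta": (0.5, 0.9), "rmse": {"random_forest": 0.75, "svm": 0.79, "ridge": 0.66}},
]


def best_method(rmse_by_method):
    """Return the name of the method with the lowest error."""
    return min(rmse_by_method, key=rmse_by_method.get)


def select_algorithm(new_meta, past=PAST_PROBLEMS):
    """Algorithm selection via 1-nearest-neighbour on meta-features (an assumption)."""
    nearest = min(past, key=lambda p: dist(p["meta"], new_meta))
    return best_method(nearest["rmse"])


def default_method(past=PAST_PROBLEMS):
    """Baseline: the single method with the lowest mean error over all past problems."""
    methods = past[0]["rmse"].keys()
    mean_rmse = {m: sum(p["rmse"][m] for p in past) / len(past) for m in methods}
    return best_method(mean_rmse)


if __name__ == "__main__":
    print("baseline:", default_method())
    print("selected:", select_algorithm((0.5, 0.8)))
```

In this toy data the baseline is `random_forest`, yet for a problem whose meta-features lie near the third entry the selector picks `ridge`. That per-problem switch is the effect algorithm selection tries to exploit.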


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5d3/6956898/171e31d1fd42/10994_2017_5685_Fig1_HTML.jpg
