Using machine learning methods to predict experimental high-throughput screening data.

Author information

Mballo Chérif, Makarenkov Vladimir

Affiliation

Département d'informatique, Université du Québec à Montréal, Montreal, Québec, Canada.

Publication information

Comb Chem High Throughput Screen. 2010 Jun;13(5):430-41. doi: 10.2174/138620710791292958.

Abstract

High-throughput screening (HTS) remains a very costly process despite many recent technological advances in biotechnology. In this study we consider the application of machine learning methods to predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compound libraries and similar drug targets. Following this approach, we analyzed the Test assay from the McMaster University Data Mining and Docking Competition using binary decision trees, neural networks, support vector machines (SVM), linear discriminant analysis, k-nearest neighbors and partial least squares. First, we studied the sets of molecular and atomic descriptors separately in order to establish which of them provides better prediction. Then, the six machine learning methods were compared in terms of false positives, false negatives, sensitivity and enrichment factor. Finally, a variable selection procedure designed to improve sensitivity was implemented and applied in the framework of polynomial SVM.
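The comparison criteria named in the abstract (false positives, false negatives, sensitivity and enrichment factor) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `screening_metrics`, the 0/1 label encoding (1 = hit) and the specific enrichment factor formula, taken here as the hit rate among predicted hits divided by the hit rate in the whole library, are assumptions made for illustration; the paper may define the enrichment factor differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ScreeningMetrics:
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int
    sensitivity: float
    enrichment_factor: float


def screening_metrics(actual: Sequence[int], predicted: Sequence[int]) -> ScreeningMetrics:
    """Compare predicted hit labels with experimental ones (1 = hit, 0 = non-hit).

    Hypothetical helper: the label encoding and the enrichment factor
    definition are assumptions, not taken from the paper.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)

    n_total = len(pairs)
    n_hits = tp + fn        # compounds that are experimental hits
    n_selected = tp + fp    # compounds the model flags as hits

    # Sensitivity: fraction of experimental hits recovered by the model.
    sensitivity = tp / n_hits if n_hits else float("nan")

    # Enrichment factor (assumed definition): hit rate in the selected
    # subset relative to the hit rate in the whole library.
    if n_hits == 0 or n_selected == 0:
        enrichment = float("nan")
    else:
        enrichment = (tp / n_selected) / (n_hits / n_total)

    return ScreeningMetrics(tp, fp, fn, tn, sensitivity, enrichment)


if __name__ == "__main__":
    # Toy data: 10 compounds, 3 experimental hits, 3 predicted hits.
    actual = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
    predicted = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]
    print(screening_metrics(actual, predicted))
```

Under this definition, an enrichment factor above 1 means the compounds flagged by the model contain a higher proportion of hits than the library as a whole. HTS hit rates are typically very low, so sensitivity and enrichment are more informative here than overall accuracy.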

