Using machine learning methods to predict experimental high-throughput screening data.

Authors

Mballo Chérif, Makarenkov Vladimir

Affiliation

Département d'informatique, Université du Québec à Montréal, Montreal, Québec, Canada.

Publication information

Comb Chem High Throughput Screen. 2010 Jun;13(5):430-41. doi: 10.2174/138620710791292958.

DOI:10.2174/138620710791292958

PMID:20236062

Abstract

High-throughput screening (HTS) remains a very costly process, notwithstanding many recent technological advances in biotechnology. In this study we consider the application of machine learning methods to predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compound libraries and similar drug targets. Following this approach, we analyzed the Test assay from the McMaster University Data Mining and Docking Competition using binary decision trees, neural networks, support vector machines (SVM), linear discriminant analysis, k-nearest neighbors, and partial least squares. First, we studied the sets of molecular and atomic descriptors separately in order to establish which of them provides better predictions. Then, we compared the six machine learning methods in terms of false positives, false negatives, sensitivity, and enrichment factor. Finally, a variable selection procedure that improves the method's sensitivity was implemented and applied in the framework of polynomial SVM.
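
The comparison criteria named in the abstract (false positives, false negatives, sensitivity, and enrichment factor) can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not define the enrichment factor, so the sketch assumes one common definition: the hit rate among compounds predicted as hits divided by the hit rate in the whole library. The function name, the class name, and the toy labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int
    sensitivity: float
    enrichment_factor: float


def evaluate_hit_predictions(actual, predicted):
    """Compare predicted hit labels with measured ones (1 = hit, 0 = non-hit)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)

    total = len(pairs)
    hits = tp + fn        # compounds measured as hits
    selected = tp + fp    # compounds predicted as hits

    # Sensitivity: fraction of measured hits that the model recovers.
    sensitivity = tp / hits if hits else 0.0

    # Assumed definition of the enrichment factor:
    # (hit rate among predicted hits) / (hit rate in the whole library).
    if hits and selected:
        enrichment_factor = (tp / selected) / (hits / total)
    else:
        enrichment_factor = 0.0

    return ScreeningMetrics(tp, fp, fn, tn, sensitivity, enrichment_factor)


# Toy labels for ten compounds (made up for illustration only).
actual = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = evaluate_hit_predictions(actual, predicted)
print(metrics)
```

Under this assumed definition, an enrichment factor above 1 means the predicted hit list is richer in measured hits than a random selection of the same size would be.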

Similar articles

GPU accelerated support vector machines for mining high-throughput screening data.

J Chem Inf Model. 2009 Dec;49(12):2718-25. doi: 10.1021/ci900337f.

Seminal quality prediction using data mining methods.

Technol Health Care. 2014;22(4):531-45. doi: 10.3233/THC-140816.

Comparative study of machine-learning and chemometric tools for analysis of in-vivo high-throughput screening data.

J Chem Inf Model. 2008 Aug;48(8):1663-8. doi: 10.1021/ci800142d. Epub 2008 Aug 6.

Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods.

J Chem Inf Model. 2013 Dec 23;53(12):3244-61. doi: 10.1021/ci400527b. Epub 2013 Dec 11.

Application of Bioactivity Profile-Based Fingerprints for Building Machine Learning Models.

J Chem Inf Model. 2019 Mar 25;59(3):962-972. doi: 10.1021/acs.jcim.8b00550. Epub 2018 Nov 29.

Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

Analyst. 2011 Apr 21;136(8):1703-12. doi: 10.1039/c0an00387e. Epub 2011 Feb 25.

Virtual screening for cytochromes p450: successes of machine learning filters.

Comb Chem High Throughput Screen. 2009 May;12(4):369-82. doi: 10.2174/138620709788167935.

Potency-directed similarity searching using support vector machines.

Chem Biol Drug Des. 2011 Jan;77(1):30-8. doi: 10.1111/j.1747-0285.2010.01059.x. Epub 2010 Nov 29.

Target specific compound identification using a support vector machine.

Comb Chem High Throughput Screen. 2007 Mar;10(3):189-96. doi: 10.2174/138620707780126705.
