Effects of pooling samples on the performance of classification algorithms: a comparative study.

Author information

Kusonmano Kanthida, Netzer Michael, Baumgartner Christian, Dehmer Matthias, Liedl Klaus R, Graber Armin

Affiliation

Institute for Bioinformatics and Translational Research, UMIT, 6060 Hall in Tyrol, Austria.

Publication information

ScientificWorldJournal. 2012;2012:278352. doi: 10.1100/2012/278352. Epub 2012 Apr 30.

Abstract

A pooling design can be a powerful strategy to compensate for a limited number of samples or for high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of five widely applied classifiers: support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAM). We evaluate a variety of experimental designs using mock omics datasets with varying pool sizes, and we also consider the effects of feature selection. Our results show that feature selection significantly improves classifier performance for both non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF outperforms the other investigated algorithms in most cases, while accuracy levels are comparable among the remaining ones. Guidelines are derived to identify an optimal pooling scheme that provides adequate predictive power and, hence, to motivate a study design that best meets experimental objectives and budgetary conditions, including time constraints.
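To make the idea of virtual pooling concrete, the following is a minimal sketch, not the authors' procedure. It assumes that a pool is formed by averaging the feature vectors of randomly chosen, non-overlapping samples from the same class, and that samples left over after forming full pools are discarded. The function name `virtual_pool`, the Gaussian mock data, and all parameter values are hypothetical choices made for illustration only.

```python
import random
from statistics import fmean


def virtual_pool(samples, labels, pool_size, seed=0):
    """Average random, non-overlapping groups of same-class samples.

    Assumption for illustration: a "pool" is the feature-wise mean of
    `pool_size` samples that share a class label. Leftover samples that
    do not fill a complete pool are dropped.
    """
    rng = random.Random(seed)
    pooled_samples, pooled_labels = [], []
    for label in sorted(set(labels)):
        # Indices of all samples in this class, in random order.
        idx = [i for i, lab in enumerate(labels) if lab == label]
        rng.shuffle(idx)
        n_pools = len(idx) // pool_size
        for k in range(n_pools):
            members = idx[k * pool_size:(k + 1) * pool_size]
            # Feature-wise mean across the pool members.
            pooled = [fmean(col) for col in zip(*(samples[i] for i in members))]
            pooled_samples.append(pooled)
            pooled_labels.append(label)
    return pooled_samples, pooled_labels


# Hypothetical mock dataset: 2 classes, 30 samples each, 50 features.
# The first 5 features are shifted by 1.0 in class 1; the rest are noise.
rng = random.Random(42)
n_per_class, n_features = 30, 50
labels = [0] * n_per_class + [1] * n_per_class
samples = [
    [rng.gauss(1.0 if (lab == 1 and j < 5) else 0.0, 1.0) for j in range(n_features)]
    for lab in labels
]

pooled_samples, pooled_labels = virtual_pool(samples, labels, pool_size=3)
print(len(pooled_samples), len(pooled_samples[0]))  # 20 50
```

With a pool size of 3, the 60 individual samples collapse into 20 pooled samples. Each pooled sample is less noisy, but the classifier has fewer training examples, which is the trade-off the abstract's pool-size comparison addresses.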
