A systematic comparison of supervised classifiers.

Authors

Amancio Diego Raphael, Comin Cesar Henrique, Casanova Dalcimar, Travieso Gonzalo, Bruno Odemir Martinez, Rodrigues Francisco Aparecido, Costa Luciano da Fontoura

Affiliations

Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, São Paulo, Brazil.

São Carlos Institute of Physics, University of São Paulo, São Carlos, São Paulo, Brazil.

Publication information

PLoS One. 2014 Apr 24;9(4):e94137. doi: 10.1371/journal.pone.0094137. eCollection 2014.

DOI:10.1371/journal.pone.0094137

PMID:24763312

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3998948/

Abstract

Pattern recognition has been employed in a myriad of industrial, commercial and academic applications. Many techniques have been devised to tackle such a diversity of applications. Despite the long tradition of pattern recognition research, no technique yields the best classification in all scenarios. Therefore, as many techniques as possible should be considered in high-accuracy applications. Typical related works either focus on the performance of a given algorithm or compare various classification methods. On many occasions, however, researchers who are not experts in machine learning have to deal with practical classification tasks without in-depth knowledge of the underlying parameters. The adequate choice of classifiers and parameters in such practical circumstances is a long-standing problem and one of the subjects of the current paper. We carried out a performance study of nine well-known classifiers implemented in the Weka framework and compared the influence of the parameter configurations on the accuracy. The default configuration of parameters in Weka was found to provide near-optimal performance in most cases, with the exception of methods such as the support vector machine (SVM). In addition, the k-nearest neighbor method frequently achieved the best accuracy. Under certain conditions, it was possible to improve the quality of the SVM by more than 20% with respect to its default parameter configuration.
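The kind of comparison the abstract describes, default parameters versus a tuned configuration, can be sketched in a few lines. The example below is only an illustration and not the authors' protocol: it uses a hand-written k-nearest neighbor classifier in plain Python rather than Weka, synthetic two-class Gaussian data, and 10-fold cross-validation, all of which are assumptions made here. The function names (`make_data`, `knn_predict`, `cv_accuracy`, `compare_configurations`) are hypothetical. The default `k = 1` mirrors the default of Weka's IBk classifier.

```python
import random
from collections import Counter


def make_data(n_per_class=60, seed=0):
    """Two overlapping 2-D Gaussian classes (synthetic, illustration only)."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (1.5, 1.5))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Majority vote among the k training points nearest to `point`."""
    def sq_dist(item):
        (x, y), _ = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2

    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(data, k, folds=10):
    """Accuracy of k-NN pooled over `folds`-fold cross-validation."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]  # every folds-th item, starting at f
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, p, k) == y for p, y in test)
    return correct / len(data)


def compare_configurations(data, default_k=1, candidates=(1, 3, 5, 7, 9)):
    """Return (default accuracy, best k, best accuracy) over the candidates."""
    ks = sorted(set(candidates) | {default_k})
    scores = {k: cv_accuracy(data, k) for k in ks}
    best_k = max(scores, key=scores.get)
    return scores[default_k], best_k, scores[best_k]


if __name__ == "__main__":
    default_acc, best_k, best_acc = compare_configurations(make_data())
    print(f"default k=1 accuracy: {default_acc:.3f}")
    print(f"best k={best_k} accuracy: {best_acc:.3f}")
```

Because the best configuration is chosen on the same folds used to score it, the tuned accuracy is optimistically biased; a nested cross-validation or a held-out test set would give a fairer estimate of the gain from tuning.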

Similar articles

Computer-assisted lip diagnosis on Traditional Chinese Medicine using multi-class support vector machines.

BMC Complement Altern Med. 2012 Aug 16;12:127. doi: 10.1186/1472-6882-12-127.

Classification of THz pulse signals using two-dimensional cross-correlation feature extraction and non-linear classifiers.

Comput Methods Programs Biomed. 2016 Apr;127:64-82. doi: 10.1016/j.cmpb.2016.01.017. Epub 2016 Feb 1.

Reviewing ensemble classification methods in breast cancer.

Comput Methods Programs Biomed. 2019 Aug;177:89-112. doi: 10.1016/j.cmpb.2019.05.019. Epub 2019 May 20.

Prediction of heart disease and classifiers' sensitivity analysis.

BMC Bioinformatics. 2020 Jul 2;21(1):278. doi: 10.1186/s12859-020-03626-y.

Analysis of structural brain MRI and multi-parameter classification for Alzheimer's disease.

Biomed Tech (Berl). 2018 Jul 26;63(4):427-437. doi: 10.1515/bmt-2016-0239.

An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms.

Comput Methods Programs Biomed. 2013 Dec;112(3):441-54. doi: 10.1016/j.cmpb.2013.08.004. Epub 2013 Aug 17.

Machine Learning Algorithms for Activity-Intensity Recognition Using Accelerometer Data.

Sensors (Basel). 2021 Feb 9;21(4):1214. doi: 10.3390/s21041214.

Cardiac magnetic resonance image-based classification of the risk of arrhythmias in post-myocardial infarction patients.

Artif Intell Med. 2015 Jul;64(3):205-15. doi: 10.1016/j.artmed.2015.06.001. Epub 2015 Jul 4.

Representative Vector Machines: A Unified Framework for Classical Classifiers.

IEEE Trans Cybern. 2016 Aug;46(8):1877-88. doi: 10.1109/TCYB.2015.2457234. Epub 2015 Aug 13.

Cited by

Leveraging deep learning for the detection of socially desirable tendencies in personnel selection: A proof-of-concept.

PLoS One. 2025 Aug 5;20(8):e0329205. doi: 10.1371/journal.pone.0329205. eCollection 2025.

Exploring the power of data mining for uncovering traditional medicinal plant knowledge: A case study in Shahrbabak, Iran.

PLoS One. 2024 Jun 10;19(6):e0303229. doi: 10.1371/journal.pone.0303229. eCollection 2024.

Using full-text content to characterize and identify best seller books: A study of early 20th-century literature.

PLoS One. 2024 Apr 26;19(4):e0302070. doi: 10.1371/journal.pone.0302070. eCollection 2024.

Accurate staging of chick embryonic tissues via deep learning of salient features.

Development. 2023 Nov 15;150(22). doi: 10.1242/dev.202068. Epub 2023 Nov 16.

Framework for multi-criteria assessment of classification models for the purposes of credit scoring.

J Big Data. 2023;10(1):94. doi: 10.1186/s40537-023-00768-7. Epub 2023 Jun 2.

Evaluation of a decided sample size in machine learning applications.

BMC Bioinformatics. 2023 Feb 14;24(1):48. doi: 10.1186/s12859-023-05156-9.

A Novel Computer Vision Model for Medicinal Plant Identification Using Log-Gabor Filters and Deep Learning Algorithms.

Comput Intell Neurosci. 2022 Sep 27;2022:1189509. doi: 10.1155/2022/1189509. eCollection 2022.

Methylphenidate Differentially Affects Intrinsic Functional Connectivity of the Salience Network in Adult ADHD Treatment Responders and Non-Responders.

Biology (Basel). 2022 Sep 6;11(9):1320. doi: 10.3390/biology11091320.

Label-Free Imaging to Track Reprogramming of Human Somatic Cells.

GEN Biotechnol. 2022 Apr 1;1(2):176-191. doi: 10.1089/genbio.2022.0001. Epub 2022 Apr 20.

Comparison of Fine-Tuned Deep Convolutional Neural Networks for the Automated Classification of Lung Cancer Cytology Images with Integration of Additional Classifiers.

Asian Pac J Cancer Prev. 2022 Apr 1;23(4):1315-1324. doi: 10.31557/APJCP.2022.23.4.1315.
