

A systematic comparison of supervised classifiers.

Author information

Amancio Diego Raphael, Comin Cesar Henrique, Casanova Dalcimar, Travieso Gonzalo, Bruno Odemir Martinez, Rodrigues Francisco Aparecido, Costa Luciano da Fontoura

Affiliations

Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, São Paulo, Brazil.

São Carlos Institute of Physics, University of São Paulo, São Carlos, São Paulo, Brazil.

Publication information

PLoS One. 2014 Apr 24;9(4):e94137. doi: 10.1371/journal.pone.0094137. eCollection 2014.

Abstract

Pattern recognition has been employed in a myriad of industrial, commercial and academic applications. Many techniques have been devised to tackle such a diversity of applications. Despite the long tradition of pattern recognition research, there is no technique that yields the best classification in all scenarios. Therefore, as many techniques as possible should be considered in high-accuracy applications. Typical related works either focus on the performance of a given algorithm or compare various classification methods. On many occasions, however, researchers who are not experts in machine learning have to deal with practical classification tasks without in-depth knowledge of the underlying parameters. The adequate choice of classifiers and parameters in such practical circumstances constitutes a long-standing problem and is one of the subjects of the current paper. We carried out a performance study of nine well-known classifiers implemented in the Weka framework and compared the influence of the parameter configurations on the accuracy. The default configuration of parameters in Weka was found to provide near-optimal performance in most cases, with the exception of methods such as the support vector machine (SVM). In addition, the k-nearest neighbor method frequently achieved the best accuracy. In certain conditions, it was possible to improve the quality of SVM by more than 20% with respect to its default parameter configuration.
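The comparison described in the abstract, a default parameter setting against a sweep over alternatives, can be illustrated with a small sketch. This is not the authors' Weka pipeline: it is a standard-library Python illustration on synthetic data, and every name in it (`make_data`, `knn_predict`, `cv_accuracy`, `sweep`), the two-class Gaussian data set, and the grid of k values are assumptions made for the example. The starting value k = 1 is meant to mirror the default of Weka's k-nearest neighbor classifier (IBk), which should also be treated as an assumption here.

```python
import random
from collections import Counter


def make_data(n_per_class=60, seed=0):
    """Synthetic two-class, two-feature Gaussian data (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (1.5, 1.5))):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2
        + (item[0][1] - point[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(data, k, folds=10):
    """Fraction of points classified correctly under `folds`-fold cross-validation."""
    correct = 0
    for f in range(folds):
        # Every `folds`-th point, starting at offset f, forms the test fold.
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, p, k) == y for p, y in test)
    return correct / len(data)


def sweep(data, default_k=1, grid=(1, 3, 5, 7, 9, 11)):
    """Return (accuracy at default_k, best k in grid, accuracy at best k)."""
    default_acc = cv_accuracy(data, default_k)
    scores = {k: cv_accuracy(data, k) for k in grid}
    best_k = max(scores, key=scores.get)
    return default_acc, best_k, scores[best_k]


if __name__ == "__main__":
    default_acc, best_k, best_acc = sweep(make_data())
    print(f"default k=1: {default_acc:.3f}; best k={best_k}: {best_acc:.3f}")
```

The gap between the two printed accuracies is the quantity of interest: a small gap suggests the default is already close to the best setting in the grid, while a large gap suggests the classifier is sensitive to that parameter. Numbers from this toy data say nothing about the results reported in the paper.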


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a47/3998948/e6811b7ddb9c/pone.0094137.g001.jpg
