Liu Yufeng, Zhang Hao Helen, Wu Yichao
Department of Statistics and Operations Research, Carolina Center for Genome Sciences, University of North Carolina, Chapel Hill, NC 27599.
J Am Stat Assoc. 2011 Mar 1;106(493):166-177. doi: 10.1198/jasa.2011.tm10319.
Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Among the many available classifiers, some are hard classifiers while others are soft ones. Soft classifiers explicitly estimate the class conditional probabilities and then perform classification based on the estimated probabilities. In contrast, hard classifiers directly target the classification decision boundary without producing probability estimates. These two types of classifiers are based on different philosophies, and each has its own merits. In this paper, we propose a novel family of large-margin classifiers, namely large-margin unified machines (LUMs), which covers a broad range of margin-based classifiers including both hard and soft ones. By offering a natural bridge from soft to hard classification, the LUM provides a unified algorithm to fit various classifiers and hence a convenient platform to compare hard and soft classification. Both the theoretical consistency and the numerical performance of LUMs are explored. Our numerical study sheds some light on the choice between hard and soft classifiers in various classification problems.
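To make the bridge from soft to hard classification concrete, the sketch below evaluates a LUM-type loss on the functional margin u = yf(x), using the parameterization with index a > 0 and c >= 0 attributed to Liu, Zhang, and Wu (2011): small c yields a soft classifier, c = a = 1 gives a DWD-type loss, and large c approaches the SVM hinge loss. The function name and the choice of NumPy are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def lum_loss(u, a=1.0, c=1.0):
    """LUM-type loss on the margin u = y*f(x).

    Piecewise form (as recalled from the LUM family, a > 0, c >= 0):
      V(u) = 1 - u                                   if u <  c/(1+c)
      V(u) = (1/(1+c)) * (a / ((1+c)u - c + a))**a   if u >= c/(1+c)
    Illustrative sketch only; parameter names follow the paper's notation.
    """
    u = np.asarray(u, dtype=float)
    thresh = c / (1.0 + c)
    linear = 1.0 - u                         # hinge-like part for small margins
    safe_u = np.maximum(u, thresh)           # keep the unused branch numerically valid
    tail = (1.0 / (1.0 + c)) * (a / ((1.0 + c) * safe_u - c + a)) ** a
    return np.where(u < thresh, linear, tail)

# Larger u means a more confident, correct classification.
margins = np.linspace(-2, 3, 7)
for c in (0.0, 1.0, 1e6):   # soft, DWD-like, and nearly hinge (hard) regimes
    print(f"c={c:g}:", np.round(lum_loss(margins, a=1.0, c=c), 3))
```

Under this parameterization the two pieces meet continuously at u = c/(1+c), so varying c traces a smooth path between soft and hard classification losses.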