Suppr 超能文献


On the Rates of Convergence From Surrogate Risk Minimizers to the Bayes Optimal Classifier.

Publication Info

IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5766-5774. doi: 10.1109/TNNLS.2021.3071370. Epub 2022 Oct 5.

DOI: 10.1109/TNNLS.2021.3071370
PMID: 33882001
Abstract

In classification, the use of 0-1 loss is preferable since the minimizer of 0-1 risk leads to the Bayes optimal classifier. However, due to the nonconvexity of 0-1 loss, this optimization problem is NP-hard. Therefore, many convex surrogate loss functions have been adopted. Previous works have shown that if a Bayes-risk consistent loss function is used as a surrogate, the minimizer of the empirical surrogate risk can achieve the Bayes optimal classifier as the sample size tends to infinity. Nevertheless, the comparison of convergence rates of minimizers of different empirical surrogate risks to the Bayes optimal classifier has rarely been studied. Which characterization of the surrogate loss determines its convergence rate to the Bayes optimal classifier? Can we modify the loss function to achieve a faster convergence rate? In this article, we study the convergence rates of empirical surrogate minimizers to the Bayes optimal classifier. Specifically, we introduce the notions of consistency intensity and conductivity to characterize a surrogate loss function and exploit this notion to obtain the rate of convergence from an empirical surrogate risk minimizer to the Bayes optimal classifier, enabling fair comparisons of the excess risks of different surrogate risk minimizers. The main result of this article has practical implications including: 1) showing that hinge loss (SVM) is superior to logistic loss (Logistic regression) and exponential loss (Adaboost) in the sense that its empirical minimizer converges faster to the Bayes optimal classifier and 2) guiding the design of new loss functions to speed up the convergence rate to the Bayes optimal classifier with a data-dependent loss correction method inspired by our theorems.
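To make the comparison in the abstract concrete, here is a minimal sketch (not code from the paper) of the margin-based surrogate losses it discusses. Each surrogate is a convex upper bound on the 0-1 loss evaluated at the margin m = y·f(x), with y ∈ {−1, +1}; the `empirical_risk` helper and the sample margins are illustrative assumptions.

```python
import math

# Margin-based losses phi(m), m = y * f(x). The 0-1 loss is the target;
# the three surrogates below are the convex, Bayes-risk consistent
# alternatives compared in the paper.

def zero_one(m):      # nonconvex; minimizing its risk is NP-hard
    return 1.0 if m <= 0 else 0.0

def hinge(m):         # SVM
    return max(0.0, 1.0 - m)

def logistic(m):      # logistic regression
    return math.log(1.0 + math.exp(-m))

def exponential(m):   # AdaBoost
    return math.exp(-m)

def empirical_risk(loss, margins):
    """Average loss over observed margins y_i * f(x_i)."""
    return sum(loss(m) for m in margins) / len(margins)

# Toy sample: two correct predictions (m > 0) and one mistake (m < 0).
# The 0-1 loss ignores how confident the correct predictions are,
# while each surrogate penalizes small margins differently.
margins = [2.0, 0.5, -0.5]
for loss in (zero_one, hinge, logistic, exponential):
    print(f"{loss.__name__:12s} {empirical_risk(loss, margins):.4f}")
```

Every surrogate risk here upper-bounds the 0-1 risk on the same sample; the paper's contribution is quantifying how fast minimizing each surrogate drives the *excess* 0-1 risk to zero.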


Similar Articles

1. On the Rates of Convergence From Surrogate Risk Minimizers to the Bayes Optimal Classifier.
   IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5766-5774. doi: 10.1109/TNNLS.2021.3071370. Epub 2022 Oct 5.
2. Discrete Box-Constrained Minimax Classifier for Uncertain and Imbalanced Class Proportions.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):2923-2937. doi: 10.1109/TPAMI.2020.3046439. Epub 2022 May 5.
3. Bayes Consistency vs. H-Consistency: The Interplay between Surrogate Loss Functions and the Scoring Function Class.
   Adv Neural Inf Process Syst. 2020 Dec;33:16927-16936.
4. Neural network for a class of sparse optimization with L-regularization.
   Neural Netw. 2022 Jul;151:211-221. doi: 10.1016/j.neunet.2022.03.033. Epub 2022 Apr 5.
5. Improved design and analysis of practical minimizers.
   Bioinformatics. 2020 Jul 1;36(Suppl_1):i119-i127. doi: 10.1093/bioinformatics/btaa472.
6. ECG Signal Classification Using Various Machine Learning Techniques.
   J Med Syst. 2018 Oct 18;42(12):241. doi: 10.1007/s10916-018-1083-6.
7. Evaluating Classification Model Against Bayes Error Rate.
   IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):9639-9653. doi: 10.1109/TPAMI.2023.3240194. Epub 2023 Jun 30.
8. A calibrated multiclass extension of AdaBoost.
   Stat Appl Genet Mol Biol. 2011 Nov 20;10(1). doi: 10.2202/1544-6115.1731.
9. Differentiable Learning of Sequence-Specific Minimizer Schemes with DeepMinimizer.
   J Comput Biol. 2022 Dec;29(12):1288-1304. doi: 10.1089/cmb.2022.0275. Epub 2022 Sep 12.
10. RKHS Bayes discriminant: a subspace constrained nonlinear feature projection for signal detection.
   IEEE Trans Neural Netw. 2009 Jul;20(7):1195-203. doi: 10.1109/TNN.2009.2021473. Epub 2009 Jun 2.