Wu Yichao, Liu Yufeng
Department of Statistics, North Carolina State University, Raleigh, NC 27695,
Stat Methodol. 2011 Jan;8(1):56-67. doi: 10.1016/j.stamet.2009.05.004.
Many large-margin classifiers, such as the Support Vector Machine (SVM), sidestep the estimation of conditional class probabilities and target classification boundaries directly. However, estimates of conditional class probabilities are useful in many applications. Wang, Shen, and Liu (2008) bridged the gap by providing an interval estimator of the conditional class probability via bracketing. The interval estimator is obtained by applying different weights to the positive and negative classes and training the corresponding weighted large-margin classifiers. They proposed estimating the weighted large-margin classifiers individually. Empirically, however, the individually estimated classification boundaries may cross each other even though, theoretically, they should not. In this work, we propose a technique to ensure that the estimated classification boundaries do not cross. Furthermore, we use the estimated conditional class probabilities to precondition the training data. The standard SVM is then applied to the preconditioned training data to achieve robustness. Simulations and real data are used to illustrate the finite-sample performance of the proposed methods.
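The bracketing idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the convention that the classifier trained with weight 1 - pi on the positive class and pi on the negative class estimates sign(p(x) - pi), so its sign at a point x should switch from +1 to -1 exactly once as pi increases. The function names `bracket_probability` and `has_crossing` and the example decision values are hypothetical.

```python
from typing import Sequence, Tuple


def bracket_probability(pis: Sequence[float],
                        decisions: Sequence[float]) -> Tuple[float, float, float]:
    """Bracket p(x) = P(Y = +1 | x) from weighted-classifier outputs at one point x.

    pis:       distinct weights in (0, 1), one per weighted classifier (assumed grid).
    decisions: decision values f_pi(x); by the assumed convention,
               f_pi(x) > 0 suggests p(x) > pi and f_pi(x) <= 0 suggests p(x) <= pi.
    Returns (lower, upper, midpoint) of the bracketing interval.
    """
    lower, upper = 0.0, 1.0
    for pi, f in zip(pis, decisions):
        if f > 0:
            lower = max(lower, pi)   # largest weight still classified as +1
        else:
            upper = min(upper, pi)   # smallest weight already classified as -1
    if lower > upper:
        # A -1 at a smaller weight than a +1: the estimated boundaries cross at x.
        raise ValueError("signs are not monotone in pi; boundaries cross at this point")
    return lower, upper, 0.5 * (lower + upper)


def has_crossing(pis: Sequence[float], decisions: Sequence[float]) -> bool:
    """Return True if the signs switch from -1 back to +1 as pi increases."""
    ordered = [f for _, f in sorted(zip(pis, decisions))]
    signs = [1 if f > 0 else -1 for f in ordered]
    return any(a < b for a, b in zip(signs, signs[1:]))


# Hypothetical decision values at a single point x for five weights.
pis = [0.1, 0.3, 0.5, 0.7, 0.9]
consistent = [1.2, 0.4, 0.1, -0.3, -2.0]   # one sign change: usable bracket
crossed = [1.2, -0.2, 0.1, -0.3, -2.0]     # -1 at 0.3 but +1 at 0.5: crossing

lower, upper, midpoint = bracket_probability(pis, consistent)
print(lower, upper, midpoint)
print(has_crossing(pis, consistent), has_crossing(pis, crossed))
```

With individually trained classifiers, nothing prevents the `crossed` pattern from occurring, in which case no valid interval exists at that point. This is the failure that a non-crossing constraint is meant to rule out.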