Tzelepis Christos, Mezaris Vasileios, Patras Ioannis
IEEE Trans Pattern Anal Mach Intell. 2018 Dec;40(12):2948-2962. doi: 10.1109/TPAMI.2017.2772235. Epub 2017 Nov 10.
In this paper, we propose a maximum margin classifier that deals with uncertainty in data input. More specifically, we reformulate the SVM framework such that each training example can be modeled by a multi-dimensional Gaussian distribution described by its mean vector and its covariance matrix, the latter modeling the uncertainty. We address the classification problem and define a cost function that is the expected value of the classical SVM cost when data samples are drawn from the multi-dimensional Gaussian distributions that form the set of training examples. Our formulation approximates the classical SVM formulation when the training examples are isotropic Gaussians with variance tending to zero. We arrive at a convex optimization problem, which we solve efficiently in the primal form using a stochastic gradient descent approach. The resulting classifier, which we name SVM with Gaussian Sample Uncertainty (SVM-GSU), is tested on synthetic data and on five popular, publicly available datasets: MNIST, WDBC, DEAP, TV News Channel Commercial Detection, and TRECVID MED. Experimental results verify the effectiveness of the proposed method.
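The expected-cost idea in the abstract can be illustrated with a minimal sketch. It assumes a linear decision function and the hinge loss as the "classical SVM cost"; the abstract itself does not spell out the closed form, so the expression below is the standard identity for the mean of a rectified Gaussian and not necessarily the authors' exact formulation. The function name `expected_hinge_loss` and its parameters are hypothetical.

```python
import math

def expected_hinge_loss(w, b, mu, sigma, y):
    """E[max(0, 1 - y * (w.x + b))] for x ~ N(mu, sigma).

    Assumption (not from the abstract): linear classifier, hinge loss.
    w, mu: lists of floats; sigma: covariance as a list of lists; y: +1 or -1.
    """
    d = len(w)
    # The margin variable 1 - y * (w.x + b) is Gaussian; this is its mean.
    m = 1.0 - y * (sum(w[i] * mu[i] for i in range(d)) + b)
    # Its variance is w^T Sigma w (y squared equals 1).
    var = sum(w[i] * sigma[i][j] * w[j] for i in range(d) for j in range(d))
    s = math.sqrt(max(var, 0.0))
    if s < 1e-12:
        # Vanishing uncertainty: fall back to the ordinary hinge loss.
        return max(0.0, m)
    z = m / s
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF at z
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF at z
    # Mean of max(0, G) for G ~ N(m, s^2).
    return m * cdf + s * pdf
```

When the covariance shrinks toward zero, the function returns the plain hinge loss, which mirrors the abstract's statement that the formulation approximates the classical SVM in that limit. With non-zero covariance the value is never smaller than the hinge loss at the mean, since the hinge is convex.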