Fu Zhouyu, Robles-Kelly Antonio, Zhou Jun
Australian National University, Canberra ACT, Australia.
IEEE Trans Neural Netw. 2010 Dec;21(12):1963-75. doi: 10.1109/TNN.2010.2080319. Epub 2010 Nov 11.
In this paper, we address the problem of combining linear support vector machines (SVMs) for classification of large-scale nonlinear datasets. The motivation is to exploit both the efficiency of linear SVMs (LSVMs) in learning and prediction and the power of nonlinear SVMs in classification. To this end, we develop an LSVM mixture model that exploits a divide-and-conquer strategy by partitioning the feature space into subregions of linearly separable data points and learning an LSVM for each of these regions. We do this implicitly by deriving a generative model over the joint data and label distributions. Consequently, we can impose priors on the mixing coefficients and perform implicit model selection in a top-down manner during the parameter estimation process. This guarantees the sparsity of the learned model. Experimental results show that the proposed method can achieve the efficiency of LSVMs in the prediction phase while still providing classification performance comparable to that of nonlinear SVMs.
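The divide-and-conquer idea can be sketched as follows. This is a minimal illustration only, not the paper's generative mixture formulation: the region split here is a fixed axis-aligned partition standing in for the learned soft partition, the per-region solver is plain hinge-loss subgradient descent standing in for a proper LSVM solver, and the XOR-style dataset is a hypothetical example of a globally nonlinear problem whose subregions are each linearly separable.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR-style dataset: not linearly separable globally, but each
# half-space (x < 0 vs. x >= 0) is linearly separable on its own.
n = 200
centers = np.array([[-2, -2], [-2, 2], [2, -2], [2, 2]])
labels_per_center = np.array([0, 1, 1, 0])
X = np.vstack([c + 0.4 * rng.standard_normal((n, 2)) for c in centers])
y = np.repeat(labels_per_center, n) * 2 - 1  # map {0, 1} -> {-1, +1}

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Subgradient descent on the regularized hinge loss
    (a simple stand-in for an LSVM training routine)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1  # points violating the margin
        if mask.any():
            grad_w = lam * w - (y[mask, None] * X[mask]).mean(axis=0)
            grad_b = -y[mask].mean()
        else:
            grad_w, grad_b = lam * w, 0.0
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hard partition of the feature space into two subregions, then one
# linear classifier per region (divide-and-conquer).
regions = (X[:, 0] >= 0).astype(int)
models = {r: train_linear_svm(X[regions == r], y[regions == r]) for r in (0, 1)}

def predict(X_new):
    """Route each point to its region's linear model."""
    r = (X_new[:, 0] >= 0).astype(int)
    out = np.empty(len(X_new))
    for k, (w, b) in models.items():
        idx = r == k
        out[idx] = np.sign(X_new[idx] @ w + b)
    return out

acc = (predict(X) == y).mean()
```

Each per-region model here costs only a dot product at prediction time, which is the efficiency argument of the abstract; the paper replaces the hard split above with mixing coefficients learned jointly with the LSVMs.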