Probabilistic classification vector machines.

Author Information

Chen Huanhuan, Tino Peter, Yao Xin

Affiliation

Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA), School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK.

Publication Information

IEEE Trans Neural Netw. 2009 Jun;20(6):901-14. doi: 10.1109/TNN.2009.2014161. Epub 2009 Apr 24.

Abstract

In this paper, a sparse learning algorithm, probabilistic classification vector machines (PCVMs), is proposed. We analyze relevance vector machines (RVMs) for classification problems and observe that adopting the same prior for different classes may lead to unstable solutions. To tackle this problem, a signed and truncated Gaussian prior is adopted over every weight in PCVMs, where the sign of the prior is determined by the class label, i.e., +1 or -1. The truncated Gaussian prior not only restricts the sign of the weights but also leads to a sparse estimation of the weight vector, and thus controls the complexity of the model. In PCVMs, the kernel parameters can be optimized simultaneously within the training algorithm. The performance of PCVMs is extensively evaluated on four synthetic data sets and 13 benchmark data sets using three performance metrics: error rate (ERR), area under the receiver operating characteristic curve (AUC), and root mean squared error (RMSE). We compare PCVMs with soft-margin support vector machines (SVM(Soft)), hard-margin support vector machines (SVM(Hard)), SVMs with kernel parameters optimized by PCVMs (SVM(PCVM)), relevance vector machines (RVMs), and some other baseline classifiers. Through five replications of the twofold cross-validation F test (i.e., the 5×2 cross-validation F test) over single data sets, and the Friedman test with the corresponding post-hoc test to compare these algorithms over multiple data sets, we find that PCVMs outperform the other algorithms, including SVM(Soft), SVM(Hard), RVM, and SVM(PCVM), on most of the data sets under all three metrics, especially under AUC. Our results also reveal that SVM(PCVM) performs slightly better than SVM(Soft), implying that the parameter optimization algorithm in PCVMs is better than cross-validation in terms of both performance and computational complexity.
We also discuss the superiority of the PCVM formulation using maximum a posteriori (MAP) analysis and margin analysis, which explain the empirical success of PCVMs.
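The signed and truncated Gaussian prior described in the abstract can be illustrated with a small sampling sketch (a hypothetical helper, not code from the paper): each weight is drawn from a zero-mean Gaussian truncated to the half-line whose sign matches its class label.

```python
import random

def sample_signed_truncated_weight(label, sigma=1.0, rng=random):
    """Draw a weight from a zero-mean Gaussian truncated by class label.

    For label +1 the prior has support [0, inf); for label -1 it has
    support (-inf, 0], so every weight shares the sign of its class.
    Simple rejection sampling, which suffices for illustration.
    """
    assert label in (+1, -1)
    while True:
        w = rng.gauss(0.0, sigma)
        if w * label >= 0.0:
            return w

rng = random.Random(0)
pos_weights = [sample_signed_truncated_weight(+1, rng=rng) for _ in range(100)]
neg_weights = [sample_signed_truncated_weight(-1, rng=rng) for _ in range(100)]
# Every sampled weight respects the sign dictated by its class label.
```

The sketch only shows the sign constraint; in the paper, sparsity arises from the full hierarchical formulation built on this truncated prior.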
