IEEE Trans Neural Netw Learn Syst. 2018 May;29(5):1876-1887. doi: 10.1109/TNNLS.2017.2688182. Epub 2017 Apr 11.
The twin support vector machine (TSVM) is widely used for classification, but it is not efficient enough for large-scale data sets. Furthermore, its optimal parameters are usually found by exhaustive grid search, which is very time-consuming, especially for multiparameter models. Although many techniques have been proposed to address these problems, all of them affect the performance of TSVM to some extent. In this paper, we propose a safe screening rule (SSR) for linear TSVM and a modified SSR (MSSR) for nonlinear TSVM, which contains multiple parameters. The SSR and MSSR can delete most training samples and thus reduce the scale of the TSVM problem before it is solved. Sequential versions of SSR and MSSR are further introduced to substantially accelerate the whole parameter tuning process. An important advantage of SSR and MSSR is that they are safe: the solution obtained with them is the same as that of the original problem. Experiments on eight real-world data sets and an imbalanced data set with different imbalance ratios demonstrate the efficiency and safety of SSR and MSSR.
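To make the notion of "safe" concrete, the following minimal sketch may help. It is not the SSR or MSSR from the paper, whose rules are not stated in this abstract. It assumes a toy box-constrained quadratic program standing in for a dual problem, with made-up values for `Q`, `p`, and `c`, and a hypothetical helper `solve_box_qp`. The variables to fix are read off the full solution (an oracle), which a real screening rule cannot do; the point is only that once the bound variables are known, the smaller problem over the remaining variables gives the same solution.

```python
def solve_box_qp(Q, p, c, sweeps=500):
    """Minimize 0.5*a'Qa + p'a subject to 0 <= a[i] <= c (cyclic coordinate descent)."""
    n = len(p)
    a = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Gradient contribution from the other coordinates.
            r = p[i] + sum(Q[i][j] * a[j] for j in range(n) if j != i)
            # Exact 1-D minimizer along coordinate i, clipped to the box.
            a[i] = min(c, max(0.0, -r / Q[i][i]))
    return a

# Toy data, made up for illustration: Q is symmetric and diagonally
# dominant with a positive diagonal, hence positive definite.
Q = [[2.0, 0.5, 0.0, 0.3],
     [0.5, 1.0, 0.2, 0.0],
     [0.0, 0.2, 3.0, 0.4],
     [0.3, 0.0, 0.4, 1.0]]
p = [-1.0, -1.0, 1.0, -2.0]
c = 1.0

# Solve the full problem once.
full = solve_box_qp(Q, p, c)

# Oracle "screening" (an assumption of this sketch): take the variables
# that sit exactly on a bound in the full solution and fix them there.
fixed = {i: v for i, v in enumerate(full) if v == 0.0 or v == c}
free = [i for i in range(len(full)) if i not in fixed]

# Reduced problem over the free variables only; the fixed variables
# contribute a constant shift to the linear term.
Q_red = [[Q[i][j] for j in free] for i in free]
p_red = [p[i] + sum(Q[i][j] * v for j, v in fixed.items()) for i in free]
reduced = solve_box_qp(Q_red, p_red, c)

# Compare the reduced solution with the matching entries of the full one.
gap = max(abs(full[i] - r) for i, r in zip(free, reduced))
print(f"kept {len(free)} of {len(full)} variables, max difference {gap:.2e}")
```

Because the objective is strictly convex, the restriction of the full solution to the free variables is the unique minimizer of the reduced problem, so the two solutions agree up to solver tolerance. A screening rule earns its speedup by identifying the fixed variables before the full problem is ever solved.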