Tao Qing, Wu Gaowei, Chu Dejun
IEEE Trans Neural Netw Learn Syst. 2018 Jul;29(7):2782-2793. doi: 10.1109/TNNLS.2017.2705429. Epub 2017 Jun 6.
The truncated regular -loss support vector machine can eliminate the excessive number of support vectors (SVs); thus, it has significant advantages in robustness and scalability. However, in this paper, we discover that the associated state-of-the-art solvers, such as difference convex algorithm and concave-convex procedure, not only have limited sparsity promoting property for general truncated losses especially the -loss but also have poor scalability for large-scale problems. To circumvent these drawbacks, we present a general multistage scheme with explicit interpretation regarding SVs as well as outliers. In particular, we solve the general nonconvex truncated loss minimization through a sequence of associated convex subproblems, in which the outliers are removed in advance. The proposed algorithm can be regarded as a structural optimization attempt carefully considering sparsity imposed by the nonconvex truncated losses. We show that this general multistage algorithm offers sufficient sparsity especially for the truncated -loss. To further improve the scalability, we propose a linear multistep algorithm by employing a single iteration of coordinate descent to monotonically decrease the objective function at each stage and a kernel algorithm by using the Karush-Kuhn-Tucker conditions to cheaply find most part of the outliers for the next stage. Comparison experiments demonstrate that our methods have superiority in sparsity as well as efficiency in scalability.
截断正则损失支持向量机可以消除过多的支持向量(SVs);因此,它在鲁棒性和可扩展性方面具有显著优势。然而,在本文中,我们发现相关的最先进求解器,如差分凸算法和凹凸过程,不仅对一般截断损失(特别是 -损失)的稀疏性促进特性有限,而且对于大规模问题的可扩展性也很差。为了克服这些缺点,我们提出了一种通用的多阶段方案,对支持向量和异常值都有明确的解释。具体来说,我们通过一系列相关的凸子问题来解决一般的非凸截断损失最小化问题,其中异常值会提前被去除。所提出的算法可以被视为一种仔细考虑非凸截断损失所施加稀疏性的结构优化尝试。我们表明,这种通用的多阶段算法具有足够的稀疏性,特别是对于截断 -损失。为了进一步提高可扩展性,我们提出了一种线性多步算法,通过在每个阶段采用一次坐标下降迭代来单调降低目标函数,以及一种核算法,通过使用Karush-Kuhn-Tucker条件来廉价地找到下一阶段的大部分异常值。比较实验表明,我们的方法在稀疏性以及可扩展性效率方面具有优势。