IEEE Trans Neural Netw Learn Syst. 2015 May;26(5):1086-97. doi: 10.1109/TNNLS.2014.2333879. Epub 2015 Mar 2.
Least squares support vector machines (LSSVMs) have been widely applied for classification and regression, with performance comparable to SVMs. However, the LSSVM model lacks sparsity and cannot handle large-scale data due to computational and memory constraints. A primal fixed-size LSSVM (PFS-LSSVM) introduces sparsity using the Nyström approximation with a set of prototype vectors (PVs), and solves an overdetermined system of linear equations in the primal. This solution, however, is not the sparsest. We investigate the sparsity-error tradeoff by introducing a second level of sparsity, achieved through L0-norm-based reductions that iteratively sparsify the LSSVM and PFS-LSSVM models. The exact choice of cardinality for the initial PV set then becomes unimportant, as the final model is highly sparse. The proposed method overcomes the problems of memory constraints and high computational cost, yielding highly sparse reductions of LSSVM models. The approximations in the two models make it possible to scale them to large-scale datasets. Experiments on real-world classification and regression data sets from the UCI repository illustrate that these approaches achieve sparse models without a significant tradeoff in error.
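The abstract describes a two-stage pipeline: a Nyström-approximated primal least-squares solve over a PV set, followed by an L0-norm-based reduction. The sketch below is illustrative only, not the authors' implementation: it assumes an RBF kernel, a randomly sampled PV set, a ridge-regularized normal-equations solve for the overdetermined primal system, and an iteratively reweighted scheme as a common surrogate for the L0-norm penalty. All function names and parameters (`pfs_lssvm_fit`, `l0_sparsify`, `n_pv`, `lam`, `gamma`) are hypothetical.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    # Pairwise RBF kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def pfs_lssvm_fit(X, y, n_pv=50, lam=1e-3, gamma=0.5, seed=0):
    # Stage 1 (sketch): the Nystrom approximation on a small PV set yields an
    # explicit finite-dimensional feature map; the model is then solved in the
    # primal as an overdetermined least-squares problem (n samples >> n_pv).
    rng = np.random.default_rng(seed)
    pv = X[rng.choice(len(X), n_pv, replace=False)]  # assumed: random PV choice
    Kmm = rbf(pv, pv, gamma)
    evals, evecs = np.linalg.eigh(Kmm + 1e-10 * np.eye(n_pv))
    T = evecs / np.sqrt(np.maximum(evals, 1e-10))    # Nystrom feature transform
    Phi = rbf(X, pv, gamma) @ T                      # explicit features, n x n_pv
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_pv), Phi.T @ y)
    return pv, T, w

def l0_sparsify(X, y, pv, T, w, lam=1e-3, gamma=0.5, n_iter=20, tol=1e-8):
    # Stage 2 (sketch): a second level of sparsity. Iterative reweighting is a
    # standard L0-norm surrogate that drives most weights to zero, so the exact
    # cardinality of the initial PV set matters little for the final model.
    Phi = rbf(X, pv, gamma) @ T
    for _ in range(n_iter):
        r = 1.0 / np.maximum(w ** 2, tol)            # large penalty on small weights
        w = np.linalg.solve(Phi.T @ Phi + lam * np.diag(r), Phi.T @ y)
    keep = np.abs(w) > 1e-6                          # surviving (nonzero) weights
    return w, keep

# Toy regression demo.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
pv, T, w = pfs_lssvm_fit(X, y)
w, keep = l0_sparsify(X, y, pv, T, w)
print(f"nonzero weights: {keep.sum()} of {len(w)}")
```

Because the stage-2 loop only shrinks an already small n_pv-dimensional system, its cost is independent of the full sample size, which is consistent with the scalability claim in the abstract.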