Qian Chengde, Tran-Dinh Quoc, Fu Sheng, Zou Changliang, Liu Yufeng
School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, P. R. China.
Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill.
Math Program. 2019 Jul;176(1-2):429-463. doi: 10.1007/s10107-019-01386-z. Epub 2019 Mar 28.
We consider the classification problem when the input features are represented as matrices rather than vectors. To preserve the intrinsic structure for classification, a successful method is the Support Matrix Machine (SMM) in [19], which optimizes an objective function consisting of a hinge loss plus a so-called spectral elastic net penalty. However, extending SMM to multicategory classification remains an open issue. Moreover, in practice the training data are often contaminated by outlying observations, which can undermine the robustness of existing matrix classification methods. In this paper, we address these issues by introducing a robust angle-based classifier, which reduces binary and multicategory problems to a unified framework. Benefiting from the use of truncated hinge loss functions, the proposed classifier achieves a degree of robustness to outliers. The underlying optimization model becomes nonconvex, but admits a natural DC (difference of two convex functions) representation. We develop a new and efficient algorithm by combining the DC algorithm with primal-dual first-order methods. The proposed DC algorithm adaptively chooses the accuracy of the subproblem at each iteration while guaranteeing the overall convergence of the algorithm. The primal-dual methods sidestep the complexity introduced by the linear operator in the subproblems, requiring only the proximal operators of the objective functions and matrix-vector operations. This advantage allows us to solve large-scale problems efficiently. Theoretical and numerical results indicate that, for problems with potential outliers, our method is highly competitive with existing methods.
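To make the DC representation concrete, the following is a minimal sketch of a truncated hinge loss written as a difference of two convex hinge functions, in the standard form T_s(u) = H_1(u) - H_s(u) with H_s(u) = max(0, s - u). The truncation level s here is an illustrative choice; the paper's actual angle-based multicategory loss may take a different form.

```python
import numpy as np

def hinge(u, s=1.0):
    # H_s(u) = max(0, s - u): a convex, piecewise-linear function of u.
    return np.maximum(0.0, s - u)

def truncated_hinge(u, s=-1.0):
    # DC decomposition T_s(u) = H_1(u) - H_s(u), the difference of two
    # convex hinges. The loss is capped at 1 - s for u <= s, so a single
    # badly misclassified point (very negative u) contributes a bounded
    # amount, which is the source of robustness to outliers.
    return hinge(u, 1.0) - hinge(u, s)
```

For example, with s = -1 the loss equals the usual hinge 1 - u on [-1, 1], vanishes for u >= 1, and is capped at 2 for u <= -1; a DC algorithm exploits this split by linearizing the subtracted convex term H_s at each iteration and solving the resulting convex subproblem.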