Yao Quanming, Yang Hansi, Hu En-Liang, Kwok James T
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6153-6168. doi: 10.1109/TPAMI.2021.3085858. Epub 2022 Sep 14.
In real-world applications, it is important for machine learning algorithms to be robust against data outliers or corruptions. In this paper, we focus on improving the robustness of a large class of learning algorithms that are formulated as low-rank semi-definite programming (SDP) problems. Traditional formulations use the square loss, which is notorious for being sensitive to outliers. We propose to replace this with more robust noise models, including the l1-loss and other nonconvex losses. However, the resultant optimization problem becomes difficult, as the objective is no longer convex or smooth. To alleviate this problem, we design an efficient algorithm based on majorization-minimization. The crux is constructing a good optimization surrogate, and we show that this surrogate can be efficiently obtained by the alternating direction method of multipliers (ADMM). By properly monitoring ADMM's convergence, the proposed algorithm is empirically efficient and also theoretically guaranteed to converge to a critical point. Extensive experiments are performed on four machine learning applications using both synthetic and real-world data sets. Results show that the proposed algorithm is not only fast but also outperforms the state of the art.
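To illustrate the majorization-minimization idea the abstract describes (replacing a square loss with a robust loss, then minimizing a tractable surrogate at each step), here is a minimal sketch for l1-loss linear regression. This is not the paper's low-rank SDP algorithm or its ADMM surrogate solver; it is the classic iteratively reweighted least-squares instance of MM, where |r| is majorized by the weighted square r^2/(2|r_k|) + |r_k|/2, tight at the current residual r_k. All function names here are illustrative.

```python
import numpy as np

def mm_l1_regression(X, y, n_iter=50, eps=1e-6):
    """Minimize sum_i |y_i - x_i^T w| by majorization-minimization.

    Each iteration majorizes the l1 loss with a weighted square loss
    (tight at the current residuals), then solves that surrogate exactly.
    """
    w = np.linalg.lstsq(X, y, rcond=None)[0]   # square-loss initialization
    for _ in range(n_iter):
        r = y - X @ w
        d = 1.0 / np.maximum(np.abs(r), eps)   # surrogate weights 1/(2|r_k|), scale-free
        W = X * d[:, None]                     # D X, with D = diag(d)
        w = np.linalg.solve(X.T @ W, W.T @ y)  # weighted least-squares step
    return w

# Usage: 10% gross outliers wreck the square loss but not the l1 fit.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
w_true = np.array([1.0, 2.0])
y = X @ w_true + 0.1 * rng.normal(size=100)
y[:10] += 20.0                                 # corrupt 10 responses
w_ls = np.linalg.lstsq(X, y, rcond=None)[0]
w_mm = mm_l1_regression(X, y)
```

Because each surrogate is solved in closed form and the surrogate touches the objective at the current iterate, the objective value is monotonically non-increasing; the paper's contribution is making an analogous surrogate step efficient (via ADMM, with monitored convergence) in the much harder low-rank SDP setting.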