Xu Meixiang, Zhu Zhenfeng, Zhang Xingxing, Zhao Yao, Li Xuelong
IEEE Trans Cybern. 2019 Apr 4. doi: 10.1109/TCYB.2019.2904753.
The success of many machine learning algorithms depends heavily on data representation. In this paper, we present an l2,1-norm constrained canonical correlation analysis (CCA) model, termed L2,1-CCA, for discovering a compact and discriminative representation of data associated with multiple views. To fully exploit the complementary and coherent information across multiple views, the l2,1-norm is employed both to constrain the canonical loadings and to measure the canonical correlation loss term. On the one hand, this endows the canonical loadings with variable-selection capability, improving the interpretability of the learned canonical variables; on the other hand, it keeps the learned canonical common representation highly consistent with the most informative canonical variables from each view of the data. Moreover, the proposed L2,1-CCA gains a degree of the desired insensitivity to noise (outliers). To solve the optimization problem, we develop an efficient alternating optimization algorithm and analyze its convergence both theoretically and experimentally. Extensive experimental results on several real-world datasets demonstrate that L2,1-CCA achieves competitive or better performance compared with representative approaches for multiview representation learning.
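The l2,1-norm that anchors the model is the sum of the Euclidean norms of a matrix's rows; penalizing it drives entire rows of the canonical loadings to zero, which is what yields variable selection. A minimal sketch, in NumPy, of the norm itself and of the diagonal reweighting surrogate commonly used in alternating algorithms for l2,1-regularized objectives (the function names and the specific updates here are illustrative, not the authors' exact algorithm):

```python
import numpy as np

def l21_norm(W):
    # l2,1-norm: sum of the Euclidean norms of the rows of W.
    # Penalizing it pushes whole rows to zero -> variable selection.
    return float(np.sum(np.linalg.norm(W, axis=1)))

def reweight_diag(W, eps=1e-8):
    # Surrogate step used in many alternating l2,1 minimization schemes:
    # with D = diag(1 / (2 * ||w_i||_2)) held fixed, the l2,1 term is
    # replaced by the smooth quadratic tr(W^T D W), which admits a
    # closed-form update; eps guards against division by zero.
    row_norms = np.linalg.norm(W, axis=1)
    return np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))

# A row-sparse loading matrix: only variables 0 and 2 are "selected".
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 1.0]])
print(l21_norm(W))  # rows contribute 5 + 0 + 1 -> 6.0
```

Alternating between this reweighting and a closed-form solve for the loadings is a standard route to optimizing l2,1-constrained objectives; the paper's algorithm additionally handles the multiview coupling and the l2,1 loss term.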