Yang Wanqi, Shi Yinghuan, Gao Yang, Wang Lei, Yang Ming
IEEE Trans Neural Netw Learn Syst. 2018 Dec;29(12):6276-6291. doi: 10.1109/TNNLS.2018.2828699. Epub 2018 May 17.
Most previous studies of dimension reduction on multiview data implicitly assume that all samples are complete in all views. This assumption is often violated in real applications because of noise, limited access to data, equipment malfunction, and so on. Most previous methods cease to work when values are missing in one or more views, so dimension reduction designed for incomplete data becomes an important problem. To this end, we mathematically formulate this problem as sparse low-rank representation through multiview subspace (SRRS) learning, which imputes missing values by jointly measuring intraview relations (via sparse low-rank representation) and interview relations (through common subspace representation). Moreover, by exploiting various subspace priors in the proposed SRRS formulation, we develop three novel dimension reduction methods for incomplete multiview data: 1) multiview subspace learning via graph embedding; 2) multiview subspace learning via structured sparsity; and 3) sparse multiview feature selection via rank minimization. For each method, we elaborate the objective function and the algorithm that solves the resulting optimization problem. We perform extensive experiments to investigate their performance on three types of tasks: data recovery, clustering, and classification. Two toy examples (Swiss roll and S-curve) and four real-world data sets (face images, multisource news, multicamera activity, and multimodality neuroimaging data) are systematically tested. Our methods achieve performance superior to that of state-of-the-art comparable methods. The results also clearly show the advantage of integrating sparsity and low-rankness over using either of them alone.
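The abstract does not give the SRRS objective, so the following is only a minimal sketch of the general idea that a low-rank prior can be used to impute missing values. It is not the authors' method: it uses a generic soft-thresholded SVD iteration (often called soft-impute) on a single matrix, with no sparsity term and no multiview coupling. The function name `soft_impute` and the parameters `lam`, `n_iters`, and `tol` are assumptions made for illustration.

```python
import numpy as np


def soft_impute(X, mask, lam=1.0, n_iters=100, tol=1e-5):
    """Fill missing entries of X using a low-rank estimate.

    X    : 2-D array; entries where mask is False are ignored (may be NaN).
    mask : boolean array, True where X is observed.
    lam  : soft-threshold applied to the singular values (illustrative default).
    """
    # Zero out unobserved entries so NaNs never reach the SVD.
    X_obs = np.where(mask, X, 0.0)
    Z = np.zeros_like(X_obs, dtype=float)  # current low-rank estimate

    for _ in range(n_iters):
        # Keep observed values, fill the rest from the current estimate.
        filled = np.where(mask, X_obs, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrinking singular values encourages a low-rank solution.
        s_shrunk = np.maximum(s - lam, 0.0)
        Z_new = (U * s_shrunk) @ Vt

        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break

    # Observed entries are returned unchanged; only missing ones are imputed.
    return np.where(mask, X_obs, Z)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))  # rank 2
    mask = rng.random(X_true.shape) < 0.7  # roughly 70% of entries observed
    X_hat = soft_impute(np.where(mask, X_true, np.nan), mask, lam=1.0, n_iters=200)
    err = np.linalg.norm((X_hat - X_true)[~mask]) / np.linalg.norm(X_true[~mask])
    print(f"relative error on missing entries: {err:.3f}")
```

In this sketch the threshold `lam` trades off fidelity against rank: larger values shrink more singular values to zero. A method along the lines described in the abstract would additionally impose sparsity on the representation and tie the views together through a common subspace, neither of which is attempted here.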