Lu Jin, Liang Guannan, Sun Jiangwen, Bi Jinbo
University of Connecticut, Storrs, CT 06269.
Adv Neural Inf Process Syst. 2016;29:4071-4079.
Matrix completion methods can benefit from side information besides the partially observed matrix. The use of side features that describe the row and column entities of a matrix has been shown to reduce the sample complexity for completing the matrix. We propose a novel sparse formulation that explicitly models the interaction between the row and column side features to approximate the matrix entries. Unlike early methods, this model does not require the low rank condition on the model parameter matrix. We prove that when the side features span the latent feature space of the matrix to be recovered, the number of observed entries needed for an exact recovery is (log ) where is the size of the matrix. If the side features are corrupted latent features of the matrix with a small perturbation, our method can achieve an -recovery with (log ) sample complexity. If side information is useless, our method maintains a () sampling rate similar to classic methods. An efficient linearized Lagrangian algorithm is developed with a convergence guarantee. Empirical results show that our approach outperforms three state-of-the-art methods both in simulations and on real world datasets.
矩阵补全方法除了可以利用部分观测矩阵之外,还能从辅助信息中受益。已证明,使用描述矩阵行和列实体的辅助特征可以降低补全矩阵所需的样本复杂度。我们提出了一种新颖的稀疏公式,该公式明确地对行和列辅助特征之间的相互作用进行建模,以近似矩阵元素。与早期方法不同,该模型不要求模型参数矩阵满足低秩条件。我们证明,当辅助特征跨越要恢复的矩阵的潜在特征空间时,精确恢复所需的观测元素数量为 (log ),其中 是矩阵的大小。如果辅助特征是带有小扰动的矩阵的损坏潜在特征,我们的方法可以通过 (log ) 的样本复杂度实现 -恢复。如果辅助信息无用,我们的方法保持与经典方法类似的 () 采样率。我们开发了一种具有收敛保证的高效线性化拉格朗日算法。实证结果表明,我们的方法在模拟和真实世界数据集上均优于三种最新方法。