Yang Zuyuan, Liang Naiyao, Yan Wei, Li Zhenni, Xie Shengli
IEEE Trans Cybern. 2021 Jun;51(6):3249-3262. doi: 10.1109/TCYB.2020.2984552. Epub 2021 May 18.
Multiview data processing has attracted sustained attention because multiple views provide more information for clustering. To integrate this information, the non-negative matrix factorization (NMF) scheme is often used, since it can reduce the data from different views into a subspace with the same dimension. Because clustering performance is affected by the distribution of the data in the learned subspace, this article proposes a tri-factorization-based NMF model with an embedding matrix. The model tends to generate decompositions with a uniform distribution, so that the learned representations are more discriminative. As a result, the obtained consensus matrix better represents the multiview data in the subspace, leading to higher clustering performance. A new lemma, together with its theoretical proof, provides formulas for the partial derivative of the trace function with respect to an inner matrix. Based on this lemma, a gradient-based algorithm is developed to solve the proposed model, and its convergence and computational complexity are analyzed. Experiments on six real-world datasets show the advantages of the proposed algorithm over existing baseline methods.
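To make the idea of a tri-factorization concrete, the following is a minimal single-view sketch in Python with NumPy. It is not the model or the algorithm from the article: it assumes a plain Frobenius loss ||X - U S V^T||_F^2 with no embedding matrix, no consensus step, and no uniformity term, and it uses the standard multiplicative updates rather than the gradient-based solver described above. All names (tri_nmf, grad_inner, k, l) are hypothetical. The gradient with respect to the inner factor S is obtained from the standard identity d tr(A S B)/dS = A^T B^T, which may differ in form from the lemma stated in the article.

```python
import numpy as np


def tri_nmf_loss(X, U, S, V):
    """Squared Frobenius reconstruction error of X ~ U S V^T."""
    return float(np.linalg.norm(X - U @ S @ V.T, "fro") ** 2)


def grad_inner(X, U, S, V):
    """Gradient of the loss with respect to the inner factor S.

    Expanding the loss into traces and applying d tr(A S B)/dS = A^T B^T
    gives 2 (U^T U S V^T V - U^T X V).
    """
    return 2.0 * (U.T @ U @ S @ V.T @ V - U.T @ X @ V)


def tri_nmf(X, k, l, n_iter=200, eps=1e-10, seed=0):
    """Non-negative tri-factorization X ~ U S V^T (illustrative only).

    X is (m, n) and non-negative; U is (m, k), S is (k, l), V is (n, l).
    Uses standard multiplicative updates, which keep all factors
    non-negative when they start non-negative.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps
    S = rng.random((k, l)) + eps
    V = rng.random((n, l)) + eps
    history = [tri_nmf_loss(X, U, S, V)]
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive part).
        U *= (X @ V @ S.T) / (U @ S @ V.T @ V @ S.T + eps)
        V *= (X.T @ U @ S) / (V @ S.T @ U.T @ U @ S + eps)
        S *= (U.T @ X @ V) / (U.T @ U @ S @ V.T @ V + eps)
        history.append(tri_nmf_loss(X, U, S, V))
    return U, S, V, history


if __name__ == "__main__":
    X = np.random.default_rng(1).random((30, 20))  # synthetic non-negative data
    U, S, V, history = tri_nmf(X, k=4, l=3, n_iter=100)
    print("initial loss:", history[0])
    print("final loss:  ", history[-1])
```

In a multiview setting one would run a factorization of this kind per view and combine the resulting representations; how the article builds its consensus matrix is not specified in the abstract, so that step is deliberately left out here.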