Yuan Aihong, You Mengbo, He Dongjian, Li Xuelong
IEEE Trans Cybern. 2022 Jun;52(6):5522-5534. doi: 10.1109/TCYB.2020.3034462. Epub 2022 Jun 16.
Unsupervised feature selection (UFS) aims to remove redundant information and select the most representative feature subset from the original data, which makes it a core step in preprocessing high-dimensional data. Many existing approaches either use self-expression to explore the correlation between data samples or use pseudolabel matrix learning to learn the mapping between the data and labels. Existing methods have also added constraints to one of these two modules to reduce redundancy, but no prior work embeds both in a joint model that selects the most representative features by their top ranking scores. To address this gap, this article presents a novel UFS method based on convex non-negative matrix factorization with an adaptive graph constraint (CNAFS). Through convex matrix factorization with an adaptive graph constraint, the method uncovers the correlation between the data and preserves the local manifold structure of the data. To our knowledge, it is the first work that integrates pseudolabel matrix learning into the self-expression module and optimizes the two simultaneously for UFS. In addition, two different manifold regularizations are constructed, one for the pseudolabel matrix and one for the encoding matrix, to preserve the local geometrical structure. Extensive experiments on benchmark datasets demonstrate the effectiveness of the method. The source code is available at: https://github.com/misteru/CNAFS.
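To make the idea of ranking features by computed scores concrete, the following is a minimal sketch of a generic self-expression baseline. It is not the CNAFS objective: it assumes a plain ridge-regularized model in which each feature is reconstructed from all features, with no non-negativity constraint, no pseudolabel matrix, and no graph regularization. The function names `self_expression_scores` and `select_top_k` and the parameter `reg` are hypothetical and chosen only for illustration.

```python
import numpy as np


def self_expression_scores(X, reg=1.0):
    """Score features with a ridge-regularized self-expression model.

    Assumption (not taken from the paper): solve
        min_W ||X - X W||_F^2 + reg * ||W||_F^2
    where X has shape (n_samples, n_features). The closed-form solution is
        W = (X^T X + reg * I)^{-1} X^T X.
    The score of feature i is the L2 norm of row i of W, i.e. how strongly
    feature i contributes to reconstructing all features.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    gram = X.T @ X
    # Solve the linear system instead of forming an explicit inverse.
    W = np.linalg.solve(gram + reg * np.eye(n_features), gram)
    return np.linalg.norm(W, axis=1)


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, best first."""
    order = np.argsort(scores)[::-1]
    return order[:k]


# Usage on synthetic data: two high-variance features and four
# near-constant ones.
rng = np.random.default_rng(0)
scales = np.array([5.0, 5.0, 0.05, 0.05, 0.05, 0.05])
X_demo = rng.normal(size=(200, 6)) * scales
demo_scores = self_expression_scores(X_demo, reg=10.0)
demo_selected = select_top_k(demo_scores, k=2)
```

In this simplified setting the row norms of `W` play the role of the ranking scores; the method described in the abstract learns its matrices jointly under additional constraints, which this sketch does not attempt to reproduce.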