Ruan Jinghou, Wang Mingwei, Liu Deqing, Chen Maolin, Gao Xianjun
School of Computer Science, Hubei University of Technology, Wuhan 430068, China.
School of Smart City, Chongqing Jiaotong University, Chongqing 400074, China.
Entropy (Basel). 2024 Nov 18;26(11):992. doi: 10.3390/e26110992.
In multi-label data, each sample is associated with multiple labels at the same time. The computational complexity arises from the high-dimensional feature space and from the interdependence and unbalanced distribution of labels, which makes feature selection challenging. To address this, a multi-label feature selection method based on feature-label subgraph association with graph representation learning (SAGRL) is proposed to represent the complex correlations among features and labels, especially the relationships between features and labels. Specifically, features and labels are mapped to nodes in a graph structure, and connections between nodes are established to form feature sets and label sets, respectively, which increases intra-class correlation and decreases inter-class correlation. Feature-label subgraphs are then constructed from the feature and label sets to provide abundant feature combinations. The relationships among subgraphs are adjusted by graph representation learning, the crucial features in different label sets are selected, and the optimal feature subset is obtained by ranking. Experiments on 11 datasets with six evaluation metrics show that the proposed method outperforms several state-of-the-art multi-label feature selection methods.
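The general idea of grouping labels into sets and ranking features per set can be illustrated with a small sketch. This is not the SAGRL algorithm: it assumes absolute Pearson correlation as the edge weight, groups only the labels (feature sets are omitted), and replaces graph representation learning with a simple maximum over per-set relevance scores. The function names, the `label_threshold` and `top_k` parameters, and the toy data are all hypothetical.

```python
import numpy as np

def abs_corr(A, B):
    """Absolute Pearson correlation between the columns of A and of B."""
    A = (A - A.mean(0)) / (A.std(0) + 1e-12)
    B = (B - B.mean(0)) / (B.std(0) + 1e-12)
    return np.abs(A.T @ B) / A.shape[0]

def connected_components(adj):
    """Group node indices that are connected in a boolean adjacency matrix."""
    seen, groups = set(), []
    for start in range(adj.shape[0]):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in np.flatnonzero(adj[node]):
                nxt = int(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups

def rank_features(X, Y, label_threshold=0.5, top_k=2):
    # Assumption: labels are linked when their |correlation| passes a threshold.
    label_sets = connected_components(abs_corr(Y, Y) >= label_threshold)
    relevance = abs_corr(X, Y)  # shape: (n_features, n_labels)
    scores = np.zeros(X.shape[1])
    for labels in label_sets:
        # Score each feature by its mean relevance to one label set,
        # then keep its best score over all label sets.
        scores = np.maximum(scores, relevance[:, labels].mean(axis=1))
    order = np.argsort(-scores)
    return order[:top_k].tolist(), label_sets

# Toy data: features 0 and 1 drive the labels, features 2 and 3 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
Y = np.column_stack([X[:, 0] > 0, X[:, 0] > 0.1, X[:, 1] > 0]).astype(float)

selected, label_sets = rank_features(X, Y)
print("label sets:", label_sets)
print("selected features:", selected)
```

In this toy setup the first two labels depend on the same feature and so fall into one label set, while the third label forms its own set. Taking the maximum over sets lets a feature that matters for only one label set still rank highly.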