IEEE Trans Neural Netw Learn Syst. 2023 Jun;34(6):3044-3057. doi: 10.1109/TNNLS.2021.3111288. Epub 2023 Jun 1.
Multilabel feature selection plays an essential role in high-dimensional multilabel learning tasks. Existing multilabel feature selection approaches mainly explore either the feature-label and feature-feature correlations or the label-label and feature-feature correlations. Few of them can deal with all three types of correlations simultaneously. To address this problem, in this article, we formulate multilabel feature selection as a local causal structure learning problem and propose a novel algorithm, M2LC. By learning the local causal structure of each class label, M2LC considers all three types of relationships simultaneously and scales to high-dimensional datasets. To correct false discoveries caused by label-label correlations, M2LC includes two novel error-correction subroutines. Through local causal structure learning, M2LC learns the causal mechanism behind the data; it can therefore select causally informative features and use the learned causal structures to visualize both the common features shared by class labels and the specific features owned by an individual class label. Extensive experiments have been conducted to evaluate M2LC against state-of-the-art multilabel feature selection algorithms.
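To make the idea of per-label local structure learning concrete, the following is a minimal sketch and not the M2LC algorithm itself. It assumes a Fisher z-test on partial correlations as the conditional independence test, conditioning sets of at most one feature, and synthetic data. The function names (`partial_corr`, `independent`, `local_neighbors`) and all parameters are hypothetical, and the error-correction subroutines for label-label correlations described in the abstract are not reproduced.

```python
import itertools
import math

import numpy as np


def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)  # precision matrix of the selected columns
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def independent(data, i, j, cond, alpha):
    """Fisher z-test: True if i and j look independent given cond."""
    n = data.shape[0]
    r = min(max(partial_corr(data, i, j, cond), -0.999999), 0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2))  # two-sided, standard normal
    return p_value > alpha


def local_neighbors(data, target, candidates, max_cond=1, alpha=0.001):
    """Features that stay dependent on target given small conditioning sets."""
    # Step 1: drop candidates that are marginally independent of the target.
    kept = [c for c in candidates if not independent(data, target, c, [], alpha)]
    # Step 2: drop candidates that some subset of the other kept ones screens off.
    for c in list(kept):
        others = [k for k in kept if k != c]
        for size in range(1, max_cond + 1):
            subsets = itertools.combinations(others, size)
            if any(independent(data, target, c, list(s), alpha) for s in subsets):
                kept.remove(c)
                break
    return kept


# Synthetic data (an assumption for illustration only):
# x0 drives both labels, x1 drives only y1, x2 is noise, x3 is a noisy copy of x0.
rng = np.random.default_rng(0)
n = 2000
x0, x1, x2 = rng.normal(size=(3, n))
x3 = x0 + rng.normal(size=n)
y0 = (x0 + 0.5 * rng.normal(size=n) > 0).astype(float)
y1 = (x0 + x1 + 0.5 * rng.normal(size=n) > 0).astype(float)

data = np.column_stack([x0, x1, x2, x3, y0, y1])
feature_cols = [0, 1, 2, 3]
label_cols = [4, 5]

# Learn a local neighborhood for each label, then split common vs. specific features.
selected = {lab: local_neighbors(data, lab, feature_cols) for lab in label_cols}
common = set.intersection(*(set(v) for v in selected.values()))
specific = {lab: set(feats) - common for lab, feats in selected.items()}

print("selected:", selected)
print("common:", common)
print("specific:", specific)
```

In this toy setup, a redundant feature such as `x3` would typically be screened off once the test conditions on `x0`, which is the kind of feature-feature relationship a local causal view is meant to capture. Handling label-label correlations would need additional steps that this sketch leaves out.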