Causal Feature Selection With Dual Correction.

Authors

Guo Xianjie, Yu Kui, Liu Lin, Cao Fuyuan, Li Jiuyong

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Jun 8;PP. doi: 10.1109/TNNLS.2022.3178075.

DOI:10.1109/TNNLS.2022.3178075

Abstract

Causal feature selection methods aim to identify a Markov boundary (MB) of a class variable, and almost all the existing causal feature selection algorithms use conditional independence (CI) tests to learn the MB. However, in real-world applications, due to data issues (e.g., noisy or small samples), CI tests can be unreliable; thus, causal feature selection algorithms relying on CI tests encounter two types of errors: false positives (i.e., selecting false MB features) and false negatives (i.e., discarding true MB features). Existing algorithms only tackle either false positives or false negatives, and they cannot deal with both types of errors at the same time, leading to unsatisfactory results. To address this issue, we propose a dual-correction-strategy-based MB learning (DCMB) algorithm to correct the two types of errors simultaneously. Specifically, DCMB selectively removes false positives from the MB features currently selected, while selectively retrieving false negatives from the features currently discarded. To automatically determine the optimal number of selected features for the selective removal and retrieval in the dual correction strategy, we design the simulated-annealing-based DCMB (SA-DCMB) algorithm. Using benchmark Bayesian network (BN) datasets, the experimental results demonstrate that DCMB achieves substantial improvements on the MB learning accuracy compared with the existing MB learning methods. Empirical studies in real-world datasets validate the effectiveness of SA-DCMB for classification against state-of-the-art causal and traditional feature selection algorithms.
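The abstract's starting point, that MB learners rely on CI tests which become unreliable on small or noisy samples, can be illustrated with a minimal sketch. This is not the DCMB or SA-DCMB algorithm. It assumes a Fisher z-test on (partial) correlations as the CI test and a hypothetical linear-Gaussian chain A -> X -> T, in which X is the only MB feature of the class variable T. All function names, coefficients, and sample sizes below are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: correlation r is zero.

    n is the sample size, k the number of conditioning variables.
    """
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2))


def simulate_chain(n, seed):
    """Hypothetical data from A -> X -> T (assumed coefficients of 0.8)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.8 * ai + rng.gauss(0, 1) for ai in a]
    t = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    return a, x, t


if __name__ == "__main__":
    # With few samples the p-values fluctuate, so a fixed threshold can
    # keep A (a false positive) or drop X (a false negative).
    for n in (30, 5000):
        a, x, t = simulate_chain(n, seed=0)
        p_marginal = fisher_z_pvalue(pearson(a, t), n, 0)
        p_given_x = fisher_z_pvalue(partial_corr(a, t, x), n, 1)
        print(n, p_marginal, p_given_x)
```

With a large sample, A should test as dependent on T marginally and close to independent of T given X, so A is correctly excluded from the MB. With a small sample the same tests can go either way, which is the failure mode that motivates correcting false positives and false negatives together.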


Similar articles

Causal Feature Selection With Dual Correction.

IEEE Trans Neural Netw Learn Syst. 2022 Jun 8;PP. doi: 10.1109/TNNLS.2022.3178075.

Accurate Markov Boundary Discovery for Causal Feature Selection.

IEEE Trans Cybern. 2020 Dec;50(12):4983-4996. doi: 10.1109/TCYB.2019.2940509. Epub 2020 Dec 3.

Multilabel Feature Selection: A Local Causal Structure Learning Approach.

IEEE Trans Neural Netw Learn Syst. 2023 Jun;34(6):3044-3057. doi: 10.1109/TNNLS.2021.3111288. Epub 2023 Jun 1.

Online Causal Feature Selection for Streaming Features.

IEEE Trans Neural Netw Learn Syst. 2023 Mar;34(3):1563-1577. doi: 10.1109/TNNLS.2021.3105585. Epub 2023 Feb 28.

Efficient Markov Blanket Discovery and Its Application.

IEEE Trans Cybern. 2017 May;47(5):1169-1179. doi: 10.1109/TCYB.2016.2539338. Epub 2016 Mar 24.

Learning Markov Blankets From Multiple Interventional Data Sets.

IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):2005-2019. doi: 10.1109/TNNLS.2019.2927636. Epub 2019 Aug 28.

Multi-Source Causal Feature Selection.

IEEE Trans Pattern Anal Mach Intell. 2020 Sep;42(9):2240-2256. doi: 10.1109/TPAMI.2019.2908373. Epub 2019 Mar 29.

Feature Selection in the Data Stream Based on Incremental Markov Boundary Learning.

IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):6740-6754. doi: 10.1109/TNNLS.2023.3249767. Epub 2023 Oct 5.

Hybrid Causal Feature Selection for Cancer Biomarker Identification From RNA-Seq Data.

IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):1645-1655. doi: 10.1109/TCBB.2024.3406922. Epub 2024 Dec 10.

A Markov blanket-based method for detecting causal SNPs in GWAS.

BMC Bioinformatics. 2010 Apr 29;11 Suppl 3(Suppl 3):S5. doi: 10.1186/1471-2105-11-S3-S5.

Cited by

ICRL: independent causality representation learning for domain generalization.

Sci Rep. 2025 Apr 6;15(1):11771. doi: 10.1038/s41598-025-96357-0.

PMID:35675244