Information Theoretic Methods for Variable Selection-A Review.

Author information

Mielniczuk Jan

Affiliations

Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warsaw, Poland.

Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland.

Publication information

Entropy (Basel). 2022 Aug 4;24(8):1079. doi: 10.3390/e24081079.

Abstract

We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Because empirical versions of conditional mutual information are known to perform poorly in high-dimensional problems, we focus on the various ways of constructing its counterparts and on the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighting, of the Möbius expansion of conditional mutual information. We also discuss the main approaches to feature selection that apply the introduced measures of conditional dependence, together with ways of assessing the quality of the obtained vector of predictors. This involves a discussion of recent results on the asymptotic distributions of empirical counterparts of the criteria, as well as advances in resampling.
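To make the "empirical version of conditional mutual information" mentioned in the abstract concrete, the following is a minimal sketch of a plug-in estimator for discrete data. It is an illustration written for this summary, not code from the paper: the function names `entropy` and `conditional_mutual_information` and the toy XOR data are assumptions, and the estimate uses the standard identity I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) with entropies computed from observed frequencies.

```python
from collections import Counter
from math import log


def entropy(columns):
    """Plug-in Shannon entropy (in nats) of the joint distribution of the columns."""
    rows = list(zip(*columns))  # one tuple per observation
    n = len(rows)
    return -sum((c / n) * log(c / n) for c in Counter(rows).values())


def conditional_mutual_information(x, y, z_columns=()):
    """Plug-in estimate of I(X; Y | Z) for discrete samples.

    Uses I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z).
    With no conditioning columns it reduces to the mutual information I(X; Y).
    """
    z = list(z_columns)
    h_z = entropy(z) if z else 0.0
    return entropy([x] + z) + entropy([y] + z) - entropy([x, y] + z) - h_z


# Hypothetical toy data: the class y is the XOR of two binary features.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

marginal = conditional_mutual_information(x1, y)            # close to 0
conditional = conditional_mutual_information(x1, y, [x2])   # close to log(2)
print(marginal, conditional)
```

In this toy case x1 carries no information about y on its own, but is fully informative once x2 is known, which is the kind of interaction that conditional measures are meant to capture. The sketch also hints at the high-dimensional difficulty noted in the abstract: the number of distinct cells of the conditioning set grows quickly with the number of conditioning features, so the frequencies in each cell become unreliable for a fixed sample size.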
