Feature Interaction for Streaming Feature Selection.

Publication information

IEEE Trans Neural Netw Learn Syst. 2021 Oct;32(10):4691-4702. doi: 10.1109/TNNLS.2020.3025922. Epub 2021 Oct 5.

Abstract

Traditional feature selection methods assume that all data instances and features are known before learning. In many real-world applications this is not the case: we are more likely to face data streams, feature streams, or both. Feature streams are defined as features that flow in one by one over time, whereas the number of training examples remains fixed. Existing streaming feature selection methods focus on removing irrelevant and redundant features and selecting the most relevant features, but they ignore the interaction between features. A feature might have little correlation with the target concept by itself, yet be strongly correlated with it when combined with some other features. In other words, interacting features contribute to the target concept as a whole that is greater than the sum of the individual features. Because most existing streaming feature selection methods treat features individually, they miss such interactions. In this article, we focus on the problem of feature interaction in feature streams and propose a new streaming feature selection method that can select features that interact with each other, named Streaming Feature Selection considering Feature Interaction (SFS-FI). Building on a formal definition of feature interaction, we design a new metric named interaction gain that measures the degree of interaction between the newly arriving feature and the selected feature subset. We also analyze and demonstrate the relationship between feature relevance and feature interaction. Extensive experiments conducted on 14 real-world microarray data sets indicate the efficiency of our new method.
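
The claim that a feature can be uninformative alone yet informative in combination can be illustrated with a small XOR example. The abstract does not give the formula for the interaction gain used by SFS-FI, so the sketch below substitutes the classic information-theoretic quantity I(F1, F2; Y) - I(F1; Y) - I(F2; Y) as a stand-in. The function names (entropy, mutual_information, interaction_gain) and the toy data are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(*columns):
    """Shannon entropy in bits of the joint distribution of the given columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(xs, ys):
    """I(X; Y) in bits, where xs and ys are lists of discrete columns."""
    return entropy(*xs) + entropy(*ys) - entropy(*xs, *ys)


def interaction_gain(f1, f2, target):
    """Classic interaction gain (assumed stand-in, not the paper's metric):
    I(F1, F2; Y) - I(F1; Y) - I(F2; Y). Positive values mean the pair tells
    us more about the target than the two features do separately."""
    joint = mutual_information([f1, f2], [target])
    return (joint
            - mutual_information([f1], [target])
            - mutual_information([f2], [target]))


# Toy data: every combination of two binary features, repeated 25 times.
pairs = list(product([0, 1], repeat=2)) * 25
f1 = [a for a, _ in pairs]
f2 = [b for _, b in pairs]
y = [a ^ b for a, b in pairs]  # target is the XOR of the two features

print("I(F1; Y)      =", mutual_information([f1], [y]))
print("I(F2; Y)      =", mutual_information([f2], [y]))
print("I(F1, F2; Y)  =", mutual_information([f1, f2], [y]))
print("interaction   =", interaction_gain(f1, f2, y))
```

On this toy data each feature alone carries zero bits about the target, while the pair carries one full bit, so the interaction gain is one bit. A selector that scores features one at a time would discard both, which is the failure mode the abstract describes.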
