Cost-Sensitive Feature Selection by Optimizing F-Measures.

Publication information

IEEE Trans Image Process. 2018 Mar;27(3):1323-1335. doi: 10.1109/TIP.2017.2781298. Epub 2017 Dec 8.

Abstract

Feature selection improves the performance of general machine learning tasks by extracting an informative subset from high-dimensional features. Conventional feature selection methods usually ignore the class imbalance problem, so the selected features are biased towards the majority class. Because the F-measure is a more reasonable performance measure than accuracy for imbalanced data, this paper presents a feature selection algorithm that addresses class imbalance by optimizing F-measures. Since F-measure optimization can be decomposed into a series of cost-sensitive classification problems, we investigate cost-sensitive feature selection by generating and assigning different costs to each class under rigorous theoretical guidance. After solving a series of cost-sensitive feature selection problems, the features corresponding to the best F-measure are selected. In this way, the selected features represent the properties of all classes. Experimental results on popular benchmarks and challenging real-world data sets demonstrate the significance of cost-sensitive feature selection in the imbalanced data setting and validate the effectiveness of the proposed method.
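
The loop described in the abstract (solve one cost-sensitive feature selection problem per cost setting, then keep the features with the best F-measure) can be illustrated with a minimal sketch. This is not the authors' algorithm: the cost grid, the `(1 - t, t)` cost parametrisation, the cost-weighted logistic regression, the top-k selection by absolute weight, and all function names below are assumptions made only for illustration, and the data are synthetic.

```python
import math
import random


def f_measure(y_true, y_pred, beta=1.0):
    """F_beta for binary labels, where 1 marks the minority (positive) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def fit_weighted_logreg(X, y, cost_pos, cost_neg, epochs=100, lr=0.1):
    """Logistic regression in which each sample's loss is scaled by its class cost."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    total = sum(cost_pos if yi == 1 else cost_neg for yi in y)
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            g = (cost_pos if yi == 1 else cost_neg) * (p - yi)
            for j in range(d):
                gw[j] += g * xi[j]
            gb += g
        w = [wj - lr * gj / total for wj, gj in zip(w, gw)]
        b -= lr * gb / total
    return w, b


def select_features_by_f_measure(X_tr, y_tr, X_va, y_va, k, costs):
    """Try each cost setting, pick k features, keep the set with the best F1."""
    best = (-1.0, None, None)
    for t in costs:
        cost_pos, cost_neg = 1.0 - t, t  # assumed parametrisation, not from the paper
        w, _ = fit_weighted_logreg(X_tr, y_tr, cost_pos, cost_neg)
        # Stand-in selection rule: keep the k largest absolute weights.
        keep = sorted(range(len(w)), key=lambda j: -abs(w[j]))[:k]
        X_sel = [[x[j] for j in keep] for x in X_tr]
        w2, b2 = fit_weighted_logreg(X_sel, y_tr, cost_pos, cost_neg)
        pred = [
            1 if b2 + sum(wj * x[j] for wj, j in zip(w2, keep)) > 0 else 0
            for x in X_va
        ]
        score = f_measure(y_va, pred)
        if score > best[0]:
            best = (score, sorted(keep), t)
    return best


def make_imbalanced(n, d, pos_rate, rng):
    """Synthetic data: only features 0 and 1 carry signal for the minority class."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < pos_rate else 0
        row = [rng.gauss(0, 1) for _ in range(d)]
        if label:
            row[0] += 2.0
            row[1] += 2.0
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_tr, y_tr = make_imbalanced(300, 8, 0.1, rng)
X_va, y_va = make_imbalanced(300, 8, 0.1, rng)
cost_grid = [0.1, 0.3, 0.5, 0.7, 0.9]
best_f, best_features, best_t = select_features_by_f_measure(
    X_tr, y_tr, X_va, y_va, k=2, costs=cost_grid
)
print(best_f, best_features, best_t)
```

Scoring each candidate feature set on held-out data rather than on the training split is a design choice of this sketch; the abstract does not say how the best F-measure is evaluated.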
