Iterative RELIEF for feature weighting: algorithms, theories, and applications.

Author

Sun Yijun

Affiliation

Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL 32610, USA.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2007 Jun;29(6):1035-51. doi: 10.1109/TPAMI.2007.1093.

DOI: 10.1109/TPAMI.2007.1093

PMID: 17431301

Abstract

RELIEF is considered one of the most successful algorithms for assessing the quality of features. In this paper, we propose a set of new feature weighting algorithms that perform significantly better than RELIEF without a large increase in computational complexity. Our work starts from a mathematical interpretation of the seemingly heuristic RELIEF algorithm as an online method solving a convex optimization problem with a margin-based objective function. This interpretation explains the success of RELIEF in real applications and enables us to identify and address two of its weaknesses: RELIEF implicitly assumes that the nearest neighbors found in the original feature space are also the nearest neighbors in the weighted space, and it lacks a mechanism to deal with outlier data. We propose an iterative RELIEF (I-RELIEF) algorithm that alleviates these deficiencies by exploring the framework of the Expectation-Maximization algorithm. We extend I-RELIEF to multiclass settings by using a new multiclass margin definition. To reduce computational costs, an online learning algorithm is also developed. Convergence analysis of the proposed algorithms is presented. The results of large-scale experiments on the UCI and microarray data sets are reported; they demonstrate the effectiveness of the proposed algorithms and verify the presented theoretical results.
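
For readers unfamiliar with the baseline, the following is a minimal sketch of the classic two-class RELIEF update that the abstract refers to, not the I-RELIEF method proposed in the paper. It assumes L1 distances, uniform random sampling of training points, and at least two samples per class; the function name `relief_weights` and its parameters are illustrative choices, not taken from the paper. Each step finds the nearest same-class neighbor (hit) and the nearest other-class neighbor (miss) in the original, unweighted feature space, which is exactly the assumption the abstract identifies as a weakness.

```python
import numpy as np

def relief_weights(X, y, n_iter=None, seed=0):
    """Sketch of basic two-class RELIEF (illustrative, not the paper's I-RELIEF).

    Assumes every class has at least two samples.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    n_iter = n if n_iter is None else n_iter
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)                      # pick a random training sample
        dist = np.abs(X - X[i]).sum(axis=1)      # L1 distance in the ORIGINAL space
        dist[i] = np.inf                         # exclude the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class point
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class point
        # Margin-style update: reward features that separate the miss,
        # penalize features that separate the hit.
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter
```

Features with larger weights are those along which a sample tends to be farther from its nearest miss than from its nearest hit. Because the neighbors are never re-selected under the learned weights, and every sample contributes equally, this sketch has no protection against either of the two weaknesses named in the abstract.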

Similar articles

Quasiconvex optimization for robust geometric reconstruction.

IEEE Trans Pattern Anal Mach Intell. 2007 Oct;29(10):1834-47. doi: 10.1109/TPAMI.2007.1083.

Simultaneous feature selection and clustering using mixture models.

IEEE Trans Pattern Anal Mach Intell. 2004 Sep;26(9):1154-66. doi: 10.1109/TPAMI.2004.71.

An EM algorithm for shape classification based on level sets.

Med Image Anal. 2005 Oct;9(5):491-502. doi: 10.1016/j.media.2005.05.001.

Effective feature extraction in high-dimensional space.

IEEE Trans Syst Man Cybern B Cybern. 2008 Dec;38(6):1652-6. doi: 10.1109/TSMCB.2008.927276.

An efficient Earth Mover's Distance algorithm for robust histogram comparison.

IEEE Trans Pattern Anal Mach Intell. 2007 May;29(5):840-53. doi: 10.1109/TPAMI.2007.1058.

Curve/surface representation and evolution using vector level sets with application to the shape-based segmentation problem.

IEEE Trans Pattern Anal Mach Intell. 2007 Jun;29(6):945-58. doi: 10.1109/TPAMI.2007.1100.

A shape-from-shading method of polyhedral objects using prior information.

IEEE Trans Pattern Anal Mach Intell. 2006 Apr;28(4):612-24. doi: 10.1109/TPAMI.2006.67.

Riemannian manifold learning.

IEEE Trans Pattern Anal Mach Intell. 2008 May;30(5):796-809. doi: 10.1109/TPAMI.2007.70735.

Fast 3D iterative image reconstruction for SPECT with rotating slat collimators.

Phys Med Biol. 2009 Feb 7;54(3):715-29. doi: 10.1088/0031-9155/54/3/016. Epub 2009 Jan 9.

Cited by

Automatic detection of Alzheimer's disease from EEG signals using an improved AFS-GA hybrid algorithm.

Cogn Neurodyn. 2024 Oct;18(5):2993-3013. doi: 10.1007/s11571-024-10130-z. Epub 2024 Jun 10.

Assessing the limitations of relief-based algorithms in detecting higher-order interactions.

BioData Min. 2024 Oct 1;17(1):37. doi: 10.1186/s13040-024-00390-0.

Assessing the Limitations of Relief-Based Algorithms in Detecting Higher-Order Interactions.

Res Sq. 2024 Sep 2:rs.3.rs-4870116. doi: 10.21203/rs.3.rs-4870116/v1.

Improving mammography lesion classification by optimal fusion of handcrafted and deep transfer learning features.

Phys Med Biol. 2022 Feb 21;67(5). doi: 10.1088/1361-6560/ac5297.

Systems biology and machine learning approaches identify drug targets in diabetic nephropathy.

Sci Rep. 2021 Dec 6;11(1):23452. doi: 10.1038/s41598-021-02282-3.

A novel systematic approach for cancer treatment prognosis and its applications in oropharyngeal cancer with microRNA biomarkers.

Bioinformatics. 2021 Oct 11;37(19):3106-3114. doi: 10.1093/bioinformatics/btab242.

A Machine Learning Approach to Monitor the Emergence of Late Intrauterine Growth Restriction.

Front Artif Intell. 2021 Mar 8;4:622616. doi: 10.3389/frai.2021.622616. eCollection 2021.

Implementing a high-efficiency similarity analysis approach for firmware code.

PLoS One. 2021 Jan 12;16(1):e0245098. doi: 10.1371/journal.pone.0245098. eCollection 2021.

Detecting biomarkers from microarray data using distributed correlation based gene selection.

Genes Genomics. 2020 Apr;42(4):449-465. doi: 10.1007/s13258-020-00916-w. Epub 2020 Feb 10.

Multi-group diagnostic classification of high-dimensional data using differential scanning calorimetry plasma thermograms.

PLoS One. 2019 Aug 20;14(8):e0220765. doi: 10.1371/journal.pone.0220765. eCollection 2019.
