Relevance, redundancy, and complementarity trade-off (RRCT): A principled, generic, robust feature-selection tool.

Author information

Tsanas Athanasios

Affiliations

Usher Institute, Edinburgh Medical School, University of Edinburgh, NINE Edinburgh BioQuarter, 9 Little France Road, Edinburgh, UK.

School of Mathematics, University of Edinburgh, Edinburgh, UK.

Publication information

Patterns (N Y). 2022 Mar 31;3(5):100471. doi: 10.1016/j.patter.2022.100471. eCollection 2022 May 13.

Abstract

We present a new heuristic feature-selection (FS) algorithm that integrates in a principled algorithmic framework the three key FS components: relevance, redundancy, and complementarity. Thus, we call it relevance, redundancy, and complementarity trade-off (RRCT). The association strength between each feature and the response and between feature pairs is quantified via an information theoretic transformation of rank correlation coefficients, and the feature complementarity is quantified using partial correlation coefficients. We empirically benchmark the performance of RRCT against 19 FS algorithms across four synthetic and eight real-world datasets in indicative challenging settings evaluating the following: (1) matching the true feature set and (2) out-of-sample performance in binary and multi-class classification problems when presenting selected features into a random forest. RRCT is very competitive in both tasks, and we tentatively make suggestions on the generalizability and application of the best-performing FS algorithms across settings where they may operate effectively.
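
To make the three ingredients named in the abstract concrete, here is a minimal Python sketch of a greedy selector that combines relevance, redundancy, and complementarity. It is an illustration under stated assumptions, not the published RRCT algorithm. The abstract does not give the transformation or the scoring rule, so the sketch assumes the common mapping from a correlation r to -0.5 * log(1 - r^2), a first-order partial correlation for complementarity, and a simple additive score. The function names (`rank_corr`, `info_transform`, `partial_corr`, `greedy_select`) and the toy data are hypothetical.

```python
import numpy as np

def rank_corr(a, b):
    """Spearman rank correlation (no tie correction; adequate for continuous data)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

def info_transform(r, eps=1e-12):
    """ASSUMED transform: map a correlation r to -0.5 * log(1 - r^2)."""
    return float(-0.5 * np.log(max(1.0 - r * r, eps)))

def partial_corr(r_xy, r_xz, r_yz, eps=1e-12):
    """First-order partial correlation of x and y, controlling for z."""
    denom = np.sqrt(max((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2), eps))
    return (r_xy - r_xz * r_yz) / denom

def greedy_select(X, y, k):
    """Greedily pick k feature indices using an ILLUSTRATIVE additive score:
    relevance - mean redundancy + mean complementarity."""
    n_features = X.shape[1]
    rel_r = [rank_corr(X[:, j], y) for j in range(n_features)]
    relevance = [info_transform(r) for r in rel_r]
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy, complementarity = 0.0, 0.0
            for s in selected:
                r_js = rank_corr(X[:, j], X[:, s])
                redundancy += info_transform(r_js)
                # Association of feature j with y after controlling for feature s.
                complementarity += info_transform(partial_corr(rel_r[j], r_js, rel_r[s]))
            score = relevance[j] - redundancy / len(selected) + complementarity / len(selected)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data (hypothetical): x1 nearly duplicates x0, x2 is an independent signal, x3 is noise.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = x0 + 0.01 * rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2, x3])
y = x0 + x2 + 0.1 * rng.normal(size=n)

selected = greedy_select(X, y, k=2)
print("selected feature indices:", selected)
```

In this toy setup the redundancy term penalizes choosing both near-duplicate features, while the complementarity term rewards a feature whose association with the response grows once an already selected feature is controlled for. The equal weighting of the three terms is a design choice made for the sketch.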

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f8cd/9122960/0f812033b3b8/fx1.jpg
