A new feature selection method based on symmetrical uncertainty and interaction gain.

Affiliations

School of Computer Science & Technology, Dalian University of Technology, 116024, Dalian, China.

Publication information

Comput Biol Chem. 2019 Dec;83:107149. doi: 10.1016/j.compbiolchem.2019.107149. Epub 2019 Nov 6.

Abstract

Identifying important information in complex biological data is of great significance in biological research. Physiological and pathological changes in an organism are usually influenced by molecular interactions. Analyzing biological data by combining the evaluation of individual molecules with that of molecular interactions could yield a more accurate and comprehensive understanding of the organism. This study proposes an Interaction Gain - Recursive Feature Elimination (IG-RFE) method, which evaluates feature importance by combining the relevance between each feature and the class label with the interaction among features. Symmetrical uncertainty is adopted to measure the relevance between a feature and the class label. The average normalized interaction gain among feature f, each other feature, and the class label is calculated to reflect the interaction of f with the other features in the feature set F. Based on the combination of symmetrical uncertainty and normalized interaction gain, less important features are removed iteratively. To assess its performance, IG-RFE was compared with seven efficient feature selection methods (MIFS, mRMR, CMIM, ReliefF, FCBF, PGVNS and SVM-RFE) on eleven public datasets. The experimental results showed the superiority of IG-RFE in accuracy, sensitivity, specificity and stability. Hence, integrating the individual discriminative ability of features with the interactions among them could better evaluate feature importance in biological data analysis.
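
The scoring-and-elimination loop described in the abstract can be sketched roughly as follows. This is a minimal illustration for discrete features, not the authors' implementation. It uses the standard definitions SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) and IG(f_i; f_j; C) = I(f_i, f_j; C) - I(f_i; C) - I(f_j; C). The abstract does not state how the interaction gain is normalized or how the two terms are combined, so dividing by H(C) and summing the two terms are assumptions made here, and all function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_info(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)); 0 if both entropies are 0."""
    denom = entropy(x) + entropy(y)
    return 2.0 * mutual_info(x, y) / denom if denom > 0 else 0.0


def interaction_gain(fi, fj, c):
    """IG(fi; fj; C) = I(fi, fj; C) - I(fi; C) - I(fj; C)."""
    joint_mi = entropy(fi, fj) + entropy(c) - entropy(fi, fj, c)
    return joint_mi - mutual_info(fi, c) - mutual_info(fj, c)


def normalized_interaction_gain(fi, fj, c):
    # ASSUMPTION: normalise by H(C), which bounds |IG| and maps it to [-1, 1].
    # The abstract does not specify the normalisation actually used.
    h_c = entropy(c)
    return interaction_gain(fi, fj, c) / h_c if h_c > 0 else 0.0


def ig_rfe(features, labels, n_keep):
    """Iteratively drop the lowest-scoring feature until n_keep remain.

    `features` maps a feature name to its column of discrete values.
    ASSUMPTION: score = SU(f, C) + mean normalised IG with the other features.
    """
    remaining = dict(features)
    while len(remaining) > n_keep:
        scores = {}
        for name, col in remaining.items():
            others = [o for k, o in remaining.items() if k != name]
            avg_ig = (
                sum(normalized_interaction_gain(col, o, labels) for o in others)
                / len(others)
                if others
                else 0.0
            )
            scores[name] = symmetrical_uncertainty(col, labels) + avg_ig
        del remaining[min(scores, key=scores.get)]  # remove least important
    return list(remaining)


# Toy data: the label is a XOR b, so neither feature is relevant on its own,
# while `noise` is independent of everything.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
noise = [0, 0, 0, 0, 1, 1, 1, 1]
label = [x ^ y for x, y in zip(a, b)]
selected = ig_rfe({"a": a, "b": b, "noise": noise}, label, n_keep=2)
```

In this toy case each feature has zero symmetrical uncertainty with the label, so a relevance-only ranking cannot separate them. The interaction term is what keeps `a` and `b` and lets `noise` be eliminated, which is the motivation the abstract gives for combining the two measures.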

