A novel feature selection method to predict protein structural class.

Author information

Department of Automation, Xiamen University, Xiamen 361005, Fujian, China; School of Information Technology, York University, Toronto M3J 1P3, Canada.

School of Information Technology, York University, Toronto M3J 1P3, Canada.

Publication information

Comput Biol Chem. 2018 Oct;76:118-129. doi: 10.1016/j.compbiolchem.2018.06.007. Epub 2018 Jul 2.

Abstract

Integrating features derived from different protein properties helps improve the prediction accuracy of protein structural class, but it requires handling the resulting high-dimensional data. Feature selection, which picks the informative features out of the integrated set, therefore becomes an indispensable step. This paper proposes a novel feature selection method, Partial-Maximum-Correlation-Information based Recursive Feature Elimination (PMCI-RFE), to quickly select the best feature subset from the integrated high-dimensional protein feature set and thereby improve the prediction performance of protein structural class. PMCI-RFE can also be used to find different types of informative features for further analysis of biological relationships. The method uses the correlation information between the feature space and the class encoding space to select informative features, based on the idea of orthogonal component projection in the feature space. Experimental results on six widely used benchmark datasets show that PMCI-RFE is fast and effective compared with four other state-of-the-art feature selection methods, and that it can make full use of different protein property information to improve the predictability of protein structural class.
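The abstract does not give the PMCI-RFE scoring rule, so the following is only a minimal sketch of the general idea it describes, not the published algorithm. It assumes a simple stand-in score: each feature is reduced to its component orthogonal to the other remaining features, and that component is scored by its squared correlation with a centered one-hot class encoding. The lowest-scoring feature is eliminated at each round. The names `one_hot`, `orthogonal_scores`, and `correlation_rfe`, and the toy data, are hypothetical.

```python
import numpy as np

def one_hot(labels):
    """Encode class labels as a one-hot matrix (an assumed class encoding)."""
    classes, idx = np.unique(labels, return_inverse=True)
    return np.eye(len(classes))[idx]

def orthogonal_scores(X, Y):
    """Score each column of X by the squared correlation between its
    component orthogonal to the other columns and the columns of Y."""
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        # Least-squares projection of column j onto the span of the others.
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        norm = np.linalg.norm(resid)
        # A feature fully explained by the others carries no new information.
        scores[j] = 0.0 if norm < 1e-10 else np.sum(((resid / norm) @ Y) ** 2)
    return scores

def correlation_rfe(X, labels, n_keep):
    """Recursively drop the lowest-scoring feature until n_keep remain.
    Illustrative stand-in only; this is NOT the authors' PMCI-RFE."""
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)           # center features
    Y = one_hot(labels)
    Yc = Y - Y.mean(axis=0)           # center class encoding
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        scores = orthogonal_scores(Xc[:, remaining], Yc)
        remaining.pop(int(np.argmin(scores)))
    return remaining

# Toy data (hypothetical): features 0 and 1 track the class, the rest are noise.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 100)
X = rng.normal(size=(300, 6))
X[:, 0] = (labels == 0) + 0.1 * rng.normal(size=300)
X[:, 1] = (labels == 1) + 0.1 * rng.normal(size=300)
selected = correlation_rfe(X, labels, n_keep=2)
print(selected)
```

Scoring the orthogonal residual rather than the raw feature penalizes redundancy: a feature that merely duplicates information already carried by the others gets a score near zero. Refitting a least-squares projection for every feature in every round is costly for very wide feature sets, so a practical version would need a cheaper update.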
