Using Recursive Feature Selection with Random Forest to Improve Protein Structural Class Prediction for Low-Similarity Sequences.

Affiliations

College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou 310018, China.

Qixin School, Zhejiang Sci-Tech University, Hangzhou 310018, China.

Publication information

Comput Math Methods Med. 2021 May 7;2021:5529389. doi: 10.1155/2021/5529389. eCollection 2021.

Abstract

Many combinations of protein features have been used to improve protein structural class prediction, but the information redundancy among them is often ignored. To select the important features with strong classification ability, we propose a recursive feature selection method based on random forest. We evaluated the proposed method in four experiments and compared it with available competing prediction methods. The results indicate that the proposed feature selection method effectively improves the efficiency of protein structural class prediction: fewer than 5% of the features are used, yet prediction accuracy improves by 4.6-13.3%. We further compared different protein features and found that the predicted secondary structural features achieve the best performance. This understanding can be used to design more powerful prediction methods for protein structural class.
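The general idea of recursive feature selection driven by random forest importances can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the elimination schedule, the stopping rule, or the feature set, so the function name `recursive_rf_selection`, the `drop_fraction` parameter, the target of 5 features, and the synthetic four-class dataset are all assumptions made here. It uses scikit-learn's `RandomForestClassifier` and its `feature_importances_` attribute.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def recursive_rf_selection(X, y, n_keep, drop_fraction=0.2, seed=0):
    """Repeatedly fit a random forest and drop the least important features.

    Hypothetical helper: the schedule (drop_fraction) and stopping rule
    (n_keep) are illustrative choices, not taken from the paper.
    """
    kept = np.arange(X.shape[1])  # column indices still in play
    while kept.size > n_keep:
        forest = RandomForestClassifier(n_estimators=50, random_state=seed)
        forest.fit(X[:, kept], y)
        # Drop at least one feature per round, but never go below n_keep.
        n_drop = min(max(1, int(drop_fraction * kept.size)), kept.size - n_keep)
        order = np.argsort(forest.feature_importances_)  # ascending importance
        kept = np.sort(kept[order[n_drop:]])
    return kept


# Synthetic stand-in for protein feature vectors (assumption): 100 features,
# many of them redundant, and 4 classes.
X, y = make_classification(
    n_samples=300,
    n_features=100,
    n_informative=8,
    n_redundant=20,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

selected = recursive_rf_selection(X, y, n_keep=5)
print("kept feature indices:", selected)
```

When estimating accuracy on the reduced feature set, the selection step should be repeated inside each cross-validation fold; selecting features on the full dataset first would leak label information into the evaluation.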

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ed2f/8123985/492be287faec/CMMM2021-5529389.001.jpg
