Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition.

Authors

Ibrahim Wisam, Abadeh Mohammad Saniee

Affiliations

Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran.

Publication information

J Theor Biol. 2017 May 21;421:1-15. doi: 10.1016/j.jtbi.2017.03.023. Epub 2017 Mar 27.

Abstract

Protein fold recognition is an important problem in bioinformatics for predicting the three-dimensional structure of a protein. One of the most challenging tasks in fold recognition is extracting effective features from amino-acid sequences so that better classifiers can be built. In this paper, we propose six descriptors for extracting features from protein sequences. These descriptors are applied in the first stage of a three-stage framework, PCA-DELM-LDA, to extract feature vectors from the amino-acid sequences. Principal Component Analysis (PCA) is used to reduce the number of extracted features. The extracted feature vectors are used together with the original features to improve the performance of the Deep Extreme Learning Machine (DELM) in the second stage. Four new features are extracted in the second stage and passed to the third stage, where Linear Discriminant Analysis (LDA) classifies the instances into 27 folds. The proposed framework is applied to the independent and combined feature sets in the SCOP datasets. The experimental results show that the feature vectors extracted in the first stage can improve the ability of DELM to extract useful new features in the second stage.
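To make the idea of a sequence descriptor concrete, the sketch below computes amino-acid composition: the relative frequency of each of the 20 standard residues in a sequence. This is an assumption chosen for illustration. It is a widely used descriptor, but the abstract does not say which six descriptors the authors propose, and the names `AMINO_ACIDS` and `amino_acid_composition` are hypothetical.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in one-letter code (illustrative ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return a 20-dimensional vector of relative residue frequencies.

    Hypothetical helper for illustration only; not one of the six
    descriptors from the paper, which the abstract does not specify.
    """
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as X are ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino-acid residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Made-up example sequence, not taken from SCOP.
features = amino_acid_composition("MKVLAAGIVGLLLAQ")
```

A descriptor of this kind maps sequences of any length to a fixed-length vector, which is what makes a later dimensionality-reduction step such as PCA and a classifier applicable.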
