A novel fusion based on the evolutionary features for protein fold recognition using support vector machines.

Affiliations

Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran.

Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran.

Publication information

Sci Rep. 2020 Sep 1;10(1):14368. doi: 10.1038/s41598-020-71172-x.

Abstract

Protein fold recognition plays a crucial role in discovering the three-dimensional structure of proteins and their functions. Several approaches have been employed for the prediction of protein folds. Some of these approaches are based on extracting features from protein sequences and using a strong classifier. Feature extraction techniques generally utilize syntactical, evolutionary, and physicochemical information to extract features. In recent years, finding an efficient technique for integrating discriminative features has received increasing attention. In this study, we integrate the Auto-Cross-Covariance and Separated dimer evolutionary feature extraction methods. The resulting features are scored by Information gain to define and select several discriminative features. On three benchmark datasets, DD, RDD, and EDD, the support vector machine shows more than 6% improvement in accuracy.
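As a rough illustration of the fusion step, the sketch below computes Auto-Cross-Covariance and Separated dimer features from a toy position-specific scoring matrix (PSSM) and concatenates them. The abstract does not give the formulas, so the definitions used here (covariance between PSSM columns at a given lag, and summed products of column scores at positions k apart) follow the form commonly used in the literature and are assumptions. The function names, the 3-column toy matrix, and the single lag and k values are hypothetical, and the Information gain scoring and support vector machine stages are left out.

```python
# Sketch only: assumed feature definitions, toy data, no feature selection or classifier.

def separated_dimer(pssm, k):
    """For every column pair (m, n), sum pssm[i][m] * pssm[i + k][n] over positions i."""
    length = len(pssm)
    n_cols = len(pssm[0])
    feats = []
    for m in range(n_cols):
        for n in range(n_cols):
            feats.append(sum(pssm[i][m] * pssm[i + k][n] for i in range(length - k)))
    return feats


def auto_cross_covariance(pssm, lag):
    """Covariance between column a at position i and column b at position i + lag.

    Pairs with a == b are the auto-covariance terms; pairs with a != b are
    the cross-covariance terms.
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    means = [sum(row[c] for row in pssm) / length for c in range(n_cols)]
    feats = []
    for a in range(n_cols):
        for b in range(n_cols):
            total = sum(
                (pssm[i][a] - means[a]) * (pssm[i + lag][b] - means[b])
                for i in range(length - lag)
            )
            feats.append(total / (length - lag))
    return feats


def fused_features(pssm, k=1, lag=1):
    """Fusion by concatenation: ACC features followed by separated dimer features."""
    return auto_cross_covariance(pssm, lag) + separated_dimer(pssm, k)


# Toy "PSSM": 4 residues and 3 columns (a real PSSM has 20 columns).
toy_pssm = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
    [0.6, 0.1, 0.3],
]
features = fused_features(toy_pssm, k=1, lag=1)
print(len(features))  # 9 ACC terms + 9 separated dimer terms = 18
```

In a fuller pipeline, vectors like `features` would be computed for several lag and k values, scored per feature (for example by information gain), and the retained columns passed to a classifier.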
