

Mining sequential patterns for protein fold recognition.

Authors

Exarchos Themis P, Papaloukas Costas, Lampros Christos, Fotiadis Dimitrios I

Affiliation

Department of Medical Physics, Medical School, University of Ioannina, GR 45110 Ioannina, Greece.

Publication information

J Biomed Inform. 2008 Feb;41(1):165-79. doi: 10.1016/j.jbi.2007.05.004. Epub 2007 May 17.

Abstract

Protein data contain discriminative patterns that can be used in many beneficial applications if they are defined correctly. In this work, sequential pattern mining (SPM) is utilized for sequence-based fold recognition. Protein classification in terms of fold recognition plays an important role in computational protein analysis, since it can contribute to determining the function of a protein whose structure is unknown. Specifically, one of the most efficient SPM algorithms, cSPADE, is employed for the analysis of protein sequences. A classifier uses the extracted sequential patterns to classify proteins into the appropriate fold category. For training and evaluating the proposed method, we used protein sequences from the Protein Data Bank and the annotation of the SCOP database. The method exhibited an overall accuracy of 25% in a classification problem with 36 candidate categories. The classification performance reaches up to 56% when the five most probable protein folds are considered.
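The pipeline described in the abstract (mine patterns per fold, rank candidate folds for a new sequence, report top-1 and top-k accuracy) can be illustrated with a toy sketch. This is not cSPADE and not the authors' classifier: it assumes contiguous substrings stand in for sequential patterns, and it scores each fold by the fraction of its patterns found in the query. The fold names, sequences, and function names are all hypothetical.

```python
from collections import Counter

def contiguous_patterns(seq, min_len=2, max_len=3):
    """Set of contiguous substrings of seq with length in [min_len, max_len].
    A simplification: cSPADE mines gapped sequential patterns, this does not."""
    return {seq[i:i + n]
            for n in range(min_len, max_len + 1)
            for i in range(len(seq) - n + 1)}

def mine_frequent_patterns(sequences, min_support=1.0):
    """Patterns present in at least min_support (a fraction) of the sequences."""
    counts = Counter()
    for s in sequences:
        counts.update(contiguous_patterns(s))  # each pattern counted once per sequence
    threshold = min_support * len(sequences)
    return {p for p, c in counts.items() if c >= threshold}

def rank_folds(seq, fold_patterns):
    """Fold labels ordered from most to least probable under the toy score."""
    found = contiguous_patterns(seq)
    scores = {fold: (len(pats & found) / len(pats) if pats else 0.0)
              for fold, pats in fold_patterns.items()}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda f: (-scores[f], f))

def top_k_accuracy(examples, fold_patterns, k):
    """Fraction of examples whose true fold is among the k top-ranked folds."""
    hits = sum(1 for seq, label in examples
               if label in rank_folds(seq, fold_patterns)[:k])
    return hits / len(examples)

# Hypothetical training data: three made-up folds with made-up sequences.
training = {
    "fold_a": ["ACDEFG", "ACDEKL", "MACDEP"],
    "fold_b": ["GHIKLM", "QGHIKW", "GHIKYY"],
    "fold_c": ["PQRSTV", "APQRSN", "PQRSWW"],
}
fold_patterns = {fold: mine_frequent_patterns(seqs)
                 for fold, seqs in training.items()}

# Hypothetical labeled test sequences.
examples = [("WWACDEWW", "fold_a"), ("GHIKAA", "fold_b"), ("GHIPQ", "fold_c")]

print("top-1 accuracy:", top_k_accuracy(examples, fold_patterns, 1))
print("top-2 accuracy:", top_k_accuracy(examples, fold_patterns, 2))
```

In this toy setting the third test sequence is ranked under the wrong fold first but the correct fold second, which is the same effect the abstract reports when moving from the single best prediction to the five most probable folds.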

