Protein homology detection and fold inference through multiple alignment entropy profiles.

Authors

Sánchez-Flores Alejandro, Pérez-Rueda Ernesto, Segovia Lorenzo

Affiliation

Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México.

Publication

Proteins. 2008 Jan 1;70(1):248-56. doi: 10.1002/prot.21506.

Abstract

Homology detection and protein structure prediction are central themes in bioinformatics. Sequence comparison methods are limited in establishing relationships between protein sequences, or in predicting their structure, when sequence similarity is low. Recent work demonstrates that the use of profiles improves homology detection and protein structure prediction. Profiles can be inferred from protein multiple alignments using different approaches. "Conservatism-of-Conservatism" is an effective profile analysis method for identifying structural features shared by proteins that have the same fold but no detectable sequence similarity. The information obtained from protein multiple alignments varies according to the amino acid classification used to calculate the profile. In this work, we calculated entropy profiles from PSI-BLAST-derived multiple alignments using different amino acid classifications that summarize almost 500 different attributes. These entropy profiles were converted into pseudocodes, which were compared using the FASTA program with an ad hoc matrix. We tested the ability of our method to identify relationships between proteins with similar folds using a nonredundant subset of sequences with less than 40% identity. We then compared our results, using Coverage Versus Error per query curves, with those obtained by methods such as PSI-BLAST, COMPASS and HHSEARCH. Our method, named HIP (Homology Identification with Profiles), showed higher accuracy in detecting relationships between proteins with the same fold. The use of different amino acid classifications reflecting a large number of amino acid attributes improved the recognition of distantly related folds. We propose the use of pseudocodes representing profile information as a fast and powerful tool for homology detection, fold assignment and analysis of the evolutionary information contained in protein profiles.
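
The general idea of an entropy profile encoded as a pseudocode can be illustrated with a minimal sketch. This is not the HIP implementation: the three-class residue grouping, the four-letter output alphabet, and the equal-width binning are all hypothetical choices made for illustration, and the abstract does not specify how the published method performs any of these steps. The sketch computes the Shannon entropy of residue classes in each alignment column and maps each entropy value to a symbol.

```python
import math
from collections import Counter

# Hypothetical three-class grouping, for illustration only.
# The paper uses many classifications covering almost 500 attributes.
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGPH",
    "charged": "DEKR",
}
RESIDUE_TO_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def column_entropy(column):
    """Shannon entropy (bits) of the residue classes in one alignment column.

    Gaps and unknown symbols are ignored.
    """
    labels = [RESIDUE_TO_CLASS[aa] for aa in column if aa in RESIDUE_TO_CLASS]
    if not labels:
        return 0.0
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(alignment):
    """Per-column entropies for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


def to_pseudocode(profile, alphabet="ABCD"):
    """Map each entropy value to a symbol using equal-width bins (assumed scheme)."""
    max_entropy = math.log2(len(CLASSES))
    bins = len(alphabet)
    symbols = []
    for h in profile:
        idx = min(int(h / max_entropy * bins), bins - 1)
        symbols.append(alphabet[idx])
    return "".join(symbols)


# Toy alignment (assumed data): three sequences, three columns.
alignment = ["AKS", "VKD", "LRT"]
profile = entropy_profile(alignment)
pseudocode = to_pseudocode(profile)
```

In this toy alignment the first two columns are fully conserved at the class level and fall in the lowest bin, while the third column mixes polar and charged residues and falls in a higher bin. Strings produced this way could then be compared with a sequence search tool and a custom scoring matrix, as the abstract describes for FASTA.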
