Identification of related proteins with weak sequence identity using secondary structure information.

Authors

Geourjon C, Combet C, Blanchet C, Deléage G

Affiliation

Pôle BioInformatique Lyonnais, Institut de Biologie et Chimie des Protéines, Centre National de la Recherche Scientifique, UMR 5086, 69 367 Lyon CEDEX 07, France.

Publication information

Protein Sci. 2001 Apr;10(4):788-97. doi: 10.1110/ps.30001.

Abstract

Molecular modeling of proteins depends on finding homologous proteins, which is difficult when few identical residues remain after molecular evolution. Even with the most recent methods based on detecting sequence identity, structural relationships are still difficult to establish with high reliability. Because protein structures are more conserved than sequences, we investigated whether comparing protein secondary structures (observed or predicted) can discriminate between related and unrelated protein sequences in the range of 10%-30% sequence identity. Pairwise comparisons of secondary structures were scored using the structural overlap (Sov) parameter. In this article, we show that if the secondary structure similarity is >50%, most of the pairs are structurally related. By taking into account the secondary structures of proteins detected by BLAST, FASTA, or SSEARCH in the noisy region (with high E-values), we show that distantly related protein sequences (even with <20% identity) can still be identified. This strategy can be used to identify three-dimensional templates for homology modeling by finding unexpected related proteins, to select proteins for experimental investigation in a structural genomics approach, and to support genome annotation.
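
The screening idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how Sov is computed, so this sketch assumes the widely used segment-overlap formulation (per-segment overlap with a small tolerance term, normalized by reference segment length) and assumes the two secondary-structure strings are already aligned, of equal length, and gap-free. The names `segments`, `sov`, `Hit`, and `likely_related`, and the example strings, are hypothetical and not taken from the paper. Only the >50% cutoff and the 10%-30% identity range come from the abstract.

```python
from dataclasses import dataclass
from itertools import groupby


def segments(ss: str):
    """Split a secondary-structure string into (state, start, end) runs, end exclusive."""
    out, pos = [], 0
    for state, run in groupby(ss):
        n = len(list(run))
        out.append((state, pos, pos + n))
        pos += n
    return out


def sov(reference: str, other: str) -> float:
    """Segment-overlap score in percent (assumed formulation, see lead-in).

    The score is asymmetric: segment lengths of `reference` set the weights.
    """
    if len(reference) != len(other):
        raise ValueError("strings must be aligned to the same length")
    ref_segs, oth_segs = segments(reference), segments(other)
    total, norm = 0.0, 0
    for state, a1, b1 in ref_segs:
        len1 = b1 - a1
        # Segments of the same state in `other` that share at least one position.
        overlaps = [(a2, b2) for s2, a2, b2 in oth_segs
                    if s2 == state and a2 < b1 and a1 < b2]
        if not overlaps:
            norm += len1  # unmatched reference segments still count in the denominator
            continue
        for a2, b2 in overlaps:
            len2 = b2 - a2
            minov = min(b1, b2) - max(a1, a2)   # length of the shared stretch
            maxov = max(b1, b2) - min(a1, a2)   # total extent covered by both segments
            delta = min(maxov - minov, minov, len1 // 2, len2 // 2)
            total += (minov + delta) / maxov * len1
            norm += len1
    return 100.0 * total / norm if norm else 0.0


@dataclass
class Hit:
    """Hypothetical record for one low-scoring hit from a sequence search."""
    name: str
    identity: float  # percent sequence identity reported by the search
    query_ss: str    # aligned secondary structure of the query
    hit_ss: str      # aligned secondary structure of the hit


def likely_related(hits, sov_cutoff=50.0, identity_range=(10.0, 30.0)):
    """Keep hits in the low-identity range whose Sov exceeds the cutoff."""
    lo, hi = identity_range
    return [h.name for h in hits
            if lo <= h.identity <= hi and sov(h.query_ss, h.hit_ss) > sov_cutoff]


if __name__ == "__main__":
    hits = [
        Hit("candidate_a", 18.0, "CHHHHHHHHC", "CCCHHHHCCC"),
        Hit("candidate_b", 15.0, "HHHHHHHHHH", "CCCCCCCCCC"),
    ]
    print(likely_related(hits))
```

In practice the secondary-structure strings would come from observed structures or from a prediction method, and alignment gaps would need explicit handling that this sketch leaves out.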
