Detecting remotely related proteins by their interactions and sequence similarity.

Author information

Espadaler Jordi, Aragüés Ramón, Eswar Narayanan, Marti-Renom Marc A, Querol Enrique, Avilés Francesc X, Sali Andrej, Oliva Baldomero

Affiliations

Laboratori de Bioinformàtica Estructural, Grup de Recerca en Informàtica Biomèdica-Institut Municipal d'Investigació Médica (GRIB-IMIM), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain.

Publication information

Proc Natl Acad Sci U S A. 2005 May 17;102(20):7151-6. doi: 10.1073/pnas.0500831102. Epub 2005 May 9.

Abstract

The function of an uncharacterized protein is usually inferred either from its homology to, or its interactions with, characterized proteins. Here, we use both sequence similarity and protein interactions to identify relationships between remotely related protein sequences. We rely on the fact that homologous sequences share similar interactions, and, therefore, the set of interacting partners of the partners of a given protein is enriched in its homologs. The approach was benchmarked by assigning the fold and functional family to test sequences of known structure. Specifically, we relied on 1,434 proteins with known folds, as defined in the Structural Classification of Proteins (SCOP) database, and with known interacting partners, as defined in the Database of Interacting Proteins (DIP). For this subset, the specificity of fold assignment was increased from 54% for position-specific iterative BLAST (PSI-BLAST) to 75% for our approach, with a concomitant increase in sensitivity of a few percentage points. Similarly, the specificity of family assignment at the E-value threshold of 10^-8 was increased from 70% to 87%. The proposed method would be a useful tool for large-scale automated discovery of remote relationships between protein sequences, given its unique reliance on sequence similarity and protein-protein interactions.
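The core idea in the abstract, that the partners of a protein's partners are enriched in its homologs, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy interaction pairs, and the `weak_hits` set (standing in for low-confidence sequence-similarity hits, for example from a PSI-BLAST search) are all hypothetical, and the simple set intersection is an assumption made for illustration only.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def partners_of_partners(network, protein):
    """Return every protein that shares at least one partner with `protein`."""
    second_shell = set()
    for partner in network.get(protein, ()):
        second_shell |= network.get(partner, set())
    second_shell.discard(protein)  # the query itself is not a candidate
    return second_shell


def candidate_homologs(network, protein, weak_hits):
    """Keep only weak sequence hits that also share an interaction partner.

    Assumption: a plain intersection; the paper's actual scoring is not
    described in the abstract.
    """
    return partners_of_partners(network, protein) & set(weak_hits)


# Toy data (hypothetical identifiers, not taken from DIP or SCOP).
interactions = [
    ("Q", "P1"), ("Q", "P2"),   # query Q interacts with P1 and P2
    ("H1", "P1"), ("H2", "P2"), # H1 and H2 each share a partner with Q
    ("X", "P3"),                # X shares no partner with Q
]
weak_hits = {"H1", "X"}         # hypothetical low-confidence sequence hits

network = build_network(interactions)
print(sorted(candidate_homologs(network, "Q", weak_hits)))
```

In this toy example, `X` is a weak sequence hit but shares no interaction partner with `Q`, so it is filtered out, while `H1` is supported by both kinds of evidence. Combining the two signals in this way is what would be expected to raise specificity relative to sequence search alone.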
