Discover protein sequence signatures from protein-protein interaction data.

Authors

Fang Jianwen, Haasl Ryan J, Dong Yinghua, Lushington Gerald H

Affiliation

Bioinformatics Core Facility, University of Kansas, Lawrence, KS 66045, USA.

Publication information

BMC Bioinformatics. 2005 Nov 23;6:277. doi: 10.1186/1471-2105-6-277.

DOI:10.1186/1471-2105-6-277

PMID:16305745

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1310605/

Abstract

BACKGROUND

The development of high-throughput technologies such as yeast two-hybrid systems and mass spectrometry technologies has made it possible to generate large protein-protein interaction (PPI) datasets. Mining these datasets for underlying biological knowledge has, however, remained a challenge.

RESULTS

A total of 3108 sequence signatures were found, each of which was shared by a set of guest proteins interacting with one of 944 host proteins in the Saccharomyces cerevisiae genome. Approximately 94% of these sequence signatures matched entries in InterPro member databases. We identified 84 distinct sequence signatures from the remaining 172 unknown signatures. Signature-sharing information was then applied to predict the sub-cellular localization of yeast proteins, and the novel signatures were used to identify possible interaction sites.

CONCLUSION

We report a method of PPI data mining that facilitates the discovery of novel sequence signatures, using a large PPI dataset from the S. cerevisiae genome as input. That 94% of the discovered signatures were already known validates the ability of the approach to identify large numbers of signatures from PPI data. The significance of the discovered signatures was demonstrated by applying them to predict sub-cellular localization and to identify potential interaction binding sites of yeast proteins.
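
The host/guest grouping described in the results can be illustrated with a small sketch. The abstract does not state how the signatures were actually derived, so everything below is an assumption for illustration: the function names, the `min_guests` cutoff, and the naive shared k-mer search, which merely stands in for a real motif-discovery tool.

```python
from collections import defaultdict


def guests_by_host(interactions, min_guests=3):
    """Map each host protein to the set of proteins it interacts with.

    `interactions` is an iterable of (protein_a, protein_b) pairs. Every
    protein is treated as a potential host. The `min_guests` cutoff is a
    hypothetical parameter, not a value taken from the paper.
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    return {host: guests for host, guests in partners.items()
            if len(guests) >= min_guests}


def shared_kmers(sequences, k=4, min_fraction=1.0):
    """Return the k-mers found in at least `min_fraction` of the sequences.

    A naive stand-in for a motif-discovery tool; each k-mer is counted
    at most once per sequence.
    """
    counts = defaultdict(int)
    for seq in sequences:
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            counts[kmer] += 1
    needed = min_fraction * len(sequences)
    return {kmer for kmer, n in counts.items() if n >= needed}


# Toy data, invented for this example only.
ppi = [("H1", "G1"), ("H1", "G2"), ("H1", "G3"), ("G1", "G2")]
seqs = {"G1": "MKTAYLPQRS", "G2": "AAYLPQGG", "G3": "TTTAYLPQ", "H1": "MMMM"}

groups = guests_by_host(ppi, min_guests=3)
signatures = {
    host: shared_kmers([seqs[g] for g in sorted(guests)], k=4)
    for host, guests in groups.items()
}
```

In this toy case only `H1` has enough interaction partners to qualify as a host, and its three guests share the overlapping 4-mers `AYLP` and `YLPQ`. A realistic pipeline would replace `shared_kmers` with a statistical motif finder and then compare the resulting signatures against known databases such as InterPro.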
