Clustering-based approach for predicting motif pairs from protein interaction data.

Author information

Leung Henry Chi-Ming, Siu Man-Hung, Yiu Siu-Ming, Chin Francis Yuk-Lun, Sung Ken Wing-Kin

Affiliation

Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China.

Publication information

J Bioinform Comput Biol. 2009 Aug;7(4):701-16. doi: 10.1142/s0219720009004266.

DOI:10.1142/s0219720009004266

PMID:19634199

Abstract

Predicting motif pairs from a set of protein sequences based on protein-protein interaction data is an important but difficult computational problem. Tan et al. proposed a solution to this problem. However, the scoring function used in their approach (based on χ² testing) is inadequate, and the approach does not scale: it may take days to process a set of 5000 protein sequences with about 20,000 interactions. Later, Leung et al. proposed an improved scoring function and faster algorithms for the same problem. However, the model used by Leung et al. is complicated; the exact value of its scoring function is not easy to compute, so an estimated value is used in practice. In this paper, we derive a better model, based on a clustering notion, to capture the significance of a given motif pair, and we develop a fast heuristic algorithm to solve the problem. The algorithm is able to locate the correct motif pair in the yeast data set in about 45 minutes for 5000 protein sequences and 20,000 interactions. Moreover, we derive a lower bound on the p-value of a motif pair required for it to be distinguishable from random motif pairs. This lower bound has been verified using simulated data sets.
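To make the χ²-style scoring mentioned above concrete, the following is a minimal sketch of a generic Pearson χ² statistic for one candidate motif pair. It is not the exact scoring function of Tan et al., nor the clustering-based model of this paper. It assumes, purely for illustration, that motifs are exact substrings, that interactions are unordered pairs of distinct proteins, and that the 2×2 table counts protein pairs by "contains the motif pair" versus "interacts". The function name `chi_square_motif_pair` and the toy data are hypothetical.

```python
def chi_square_motif_pair(sequences, interactions, motif_a, motif_b):
    """Pearson chi-square for one motif pair over all unordered protein pairs.

    sequences:    dict mapping protein name -> amino-acid string
    interactions: set of frozensets, each holding two protein names
    Returns (chi2, table) where table[has_pair][interacts] is a count.
    Assumption: a motif "occurs" when it is an exact substring.
    """
    names = sorted(sequences)
    table = [[0, 0], [0, 0]]
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            # The pair matches if one protein has motif_a and the other motif_b.
            has_pair = (
                (motif_a in sequences[p] and motif_b in sequences[q])
                or (motif_b in sequences[p] and motif_a in sequences[q])
            )
            interacts = frozenset((p, q)) in interactions
            table[int(has_pair)][int(interacts)] += 1

    n = sum(sum(row) for row in table)
    chi2 = 0.0
    for r in (0, 1):
        for c in (0, 1):
            row_total = sum(table[r])
            col_total = table[0][c] + table[1][c]
            expected = row_total * col_total / n
            if expected > 0:  # skip empty margins to avoid division by zero
                chi2 += (table[r][c] - expected) ** 2 / expected
    return chi2, table


# Toy data (invented for illustration only).
toy_sequences = {
    "P1": "AAKRL", "P2": "GGDEF", "P3": "KRLAA",
    "P4": "TTDEF", "P5": "MMMMM", "P6": "NNNNN",
}
toy_interactions = {
    frozenset(("P1", "P2")),
    frozenset(("P3", "P4")),
    frozenset(("P5", "P6")),
}
chi2, table = chi_square_motif_pair(toy_sequences, toy_interactions, "KRL", "DEF")
```

The double loop visits every protein pair, so scoring a single motif pair is quadratic in the number of proteins. Repeating this over many candidate motif pairs is one way to see why a naive enumeration would scale poorly at 5000 sequences.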

AVAILABILITY

http://alse.cs.hku.hk/motif_pair.

Similar articles

Finding linear motif pairs from protein interaction networks: a probabilistic approach.

Comput Syst Bioinformatics Conf. 2007;6:111-9.

Discovering motif pairs at interaction sites from protein sequences on a proteome-wide scale.

Bioinformatics. 2006 Apr 15;22(8):989-96. doi: 10.1093/bioinformatics/btl020. Epub 2006 Jan 29.

Discovery of stable and significant binding motif pairs from PDB complexes and protein interaction datasets.

Bioinformatics. 2005 Feb 1;21(3):314-24. doi: 10.1093/bioinformatics/bti019. Epub 2004 Sep 16.

BMC Bioinformatics. 2006 Nov 16;7:502. doi: 10.1186/1471-2105-7-502.

Learning to predict protein-protein interactions from protein sequences.

Bioinformatics. 2003 Oct 12;19(15):1875-81. doi: 10.1093/bioinformatics/btg352.

An integrated approach to the prediction of domain-domain interactions.

BMC Bioinformatics. 2006 May 25;7:269. doi: 10.1186/1471-2105-7-269.

Predicting protein-peptide interactions via a network-based motif sampler.

Bioinformatics. 2004 Aug 4;20 Suppl 1:i274-82. doi: 10.1093/bioinformatics/bth922.

ProClust: improved clustering of protein sequences with an extended graph-based approach.

Bioinformatics. 2002;18 Suppl 2:S182-91. doi: 10.1093/bioinformatics/18.suppl_2.s182.

Finding motifs from all sequences with and without binding sites.

Bioinformatics. 2006 Sep 15;22(18):2217-23. doi: 10.1093/bioinformatics/btl371. Epub 2006 Jul 26.
