Li Haiquan, Li Jinyan
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613.
Bioinformatics. 2005 Feb 1;21(3):314-24. doi: 10.1093/bioinformatics/bti019. Epub 2004 Sep 16.
Discovering binding sites is important in the study of protein-protein interactions. In this paper, we introduce stable and significant motif pairs to model protein-binding sites. Stability is a pattern's resistance to some transformation. Significance is the unexpected frequency with which the pattern occurs in a sequence dataset of known interacting protein pairs. Discovering stable motif pairs is an iterative process that passes through a chain of changing but converging patterns. Determining the starting point for such a chain is an interesting problem. We use a protein complex dataset extracted from the Protein Data Bank to help identify those starting points, which greatly reduces the computational complexity of the problem.
We found 913 stable motif pairs, of which 765 are significant. We evaluated these motif pairs through comprehensive comparisons against random patterns. Motifs discovered in wet-lab experiments and reported in the literature were also used to confirm the effectiveness of our method.
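The idea of a chain of patterns converging to a stable motif pair can be illustrated with a small fixed-point iteration. The sketch below is only an illustration under stated assumptions, not the method of the paper: the abstract does not specify the transformation, so `transform` here is a hypothetical stand-in (it replaces each motif with the most frequent k-mer among the interaction partners of the sequences containing the other motif), and `toy_pairs`, `best_partner_motif`, and `find_stable_pair` are invented names and data.

```python
from collections import Counter

# Toy interacting sequence pairs (invented data, for illustration only).
toy_pairs = [
    ("AAKLMQ", "GGWYHT"),
    ("CCKLMR", "TTWYHS"),
    ("DKLMEE", "PWYHPP"),
    ("QQQRST", "NNNVVV"),
]

def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def best_partner_motif(motif, pairs, side, k):
    """Hypothetical step: among pairs whose `side` sequence contains `motif`,
    return the k-mer occurring in the most partner sequences (ties broken
    alphabetically), or None if no sequence contains the motif."""
    counts = Counter()
    for pair in pairs:
        if motif in pair[side]:
            counts.update(kmers(pair[1 - side], k))
    if not counts:
        return None
    return min(counts, key=lambda m: (-counts[m], m))

def transform(motif_pair, pairs, k):
    """Assumed transformation: refresh the right motif from the left one,
    then the left motif from the new right one."""
    left, _ = motif_pair
    new_right = best_partner_motif(left, pairs, 0, k)
    if new_right is None:
        return None
    new_left = best_partner_motif(new_right, pairs, 1, k)
    if new_left is None:
        return None
    return (new_left, new_right)

def find_stable_pair(start, pairs, k=3, max_iter=50):
    """Follow the chain from `start` until the pair no longer changes.
    Returns the fixed point, or None if the chain dies out or does not
    converge within max_iter steps."""
    current = start
    for _ in range(max_iter):
        nxt = transform(current, pairs, k)
        if nxt is None:
            return None
        if nxt == current:
            return current
        current = nxt
    return None

stable = find_stable_pair(("KLM", "NNN"), toy_pairs)
```

In this toy setting a pair is "stable" when applying the assumed transformation leaves it unchanged. The choice of `start` matters: a poor starting point may lead nowhere, which is the role the abstract assigns to the starting points derived from the protein complex dataset.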