Burger Lukas, van Nimwegen Erik
Biozentrum, the University of Basel, and Swiss Institute of Bioinformatics, Basel, Switzerland.
Mol Syst Biol. 2008;4:165. doi: 10.1038/msb4100203. Epub 2008 Feb 12.
Accurate and large-scale prediction of protein-protein interactions directly from amino-acid sequences is one of the great challenges in computational biology. Here we present a new Bayesian network method that predicts interaction partners using only multiple alignments of amino-acid sequences of interacting protein domains, without tunable parameters, and without the need for any training examples. We first apply the method to bacterial two-component systems and comprehensively reconstruct two-component signaling networks across all sequenced bacteria. Comparisons of our predictions with known interactions show that our method infers interaction partners genome-wide with high accuracy. To demonstrate the general applicability of our method, we show that it also accurately predicts interaction partners in a recent dataset of polyketide synthases. Analysis of the predicted genome-wide two-component signaling networks shows that cognates (interacting kinase/regulator pairs, which lie adjacent on the genome) and orphans (which lie isolated) form two relatively independent components of the signaling network in each genome. In addition, while most genes are predicted to have only a small number of interaction partners, we find that 10% of orphans form a separate class of 'hub' nodes that distribute and integrate signals to and from up to tens of different interaction partners.