Guimarães Katia S, Przytycka Teresa M
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
BMC Bioinformatics. 2008 Mar 26;9:171. doi: 10.1186/1471-2105-9-171.
The identification and characterization of interacting domain pairs is an important step towards understanding protein interactions. In the last few years, several methods to predict domain interactions have been proposed. Understanding the power and the limitations of these methods is key to the development of improved approaches and better understanding of the nature of these interactions.
Building on the previously published Parsimonious Explanation (PE) method for predicting domain-domain interactions, we introduced a new Generalized Parsimonious Explanation (GPE) method, which (i) adjusts the granularity of the domain definition to the granularity of the input data set and (ii) permits domain interactions to have different costs. This allowed for preferential selection of the so-called "co-occurring domains" as possible mediators of interactions between proteins. The performance of both variants of the parsimony method is competitive with that of the top algorithms for this problem, even though parsimony methods use less information than some of the other methods. We also examined possible enrichment of co-occurring domains and homo-domains among the domain interactions that mediate protein interactions in the network. This study was performed both by surveying domain interactions predicted by the GPE method and by using a combinatorial counting approach independent of any prediction method. Our findings indicate that, while there is a considerable propensity towards these special domain pairs among predicted domain interactions, this overrepresentation is significantly lower than in the iPfam dataset.
The Generalized Parsimonious Explanation approach provides a new means to predict and study domain-domain interactions. We showed that, under the assumption that all protein interactions in the network are mediated by domain interactions, the properties of the domain interactions mediating interactions in the network deviate significantly from those of the iPfam data.
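To make the weighted parsimony idea concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state how the optimization is solved, so a simple greedy weighted set-cover heuristic is substituted here. It assumes that each protein is annotated with a set of domains, that every protein interaction must be explained by at least one domain pair, and that each domain pair carries a cost. The names `candidate_pairs`, `greedy_parsimonious_explanation`, and `cost`, as well as the toy proteins and domains, are hypothetical.

```python
from itertools import product


def candidate_pairs(p, q, domains):
    """Return all unordered domain pairs that could mediate the interaction p-q."""
    return {tuple(sorted(pair)) for pair in product(domains[p], domains[q])}


def greedy_parsimonious_explanation(interactions, domains, cost=lambda pair: 1.0):
    """Greedily choose a low-cost set of domain pairs covering every interaction.

    This is a set-cover heuristic used for illustration only; it is not
    guaranteed to find the minimum-cost explanation.
    """
    # Map each candidate domain pair to the protein interactions it could explain.
    covers = {}
    for p, q in interactions:
        for pair in candidate_pairs(p, q, domains):
            covers.setdefault(pair, set()).add((p, q))

    # Interactions with no candidate domain pair cannot be explained and are skipped.
    uncovered = {ppi for ppis in covers.values() for ppi in ppis}
    chosen = []
    while uncovered:
        # Pick the pair with the lowest cost per newly explained interaction;
        # ties are broken by the pair itself so the result is deterministic.
        best = min(
            (pair for pair in covers if covers[pair] & uncovered),
            key=lambda pair: (cost(pair) / len(covers[pair] & uncovered), pair),
        )
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Toy data (hypothetical): four proteins, three interactions.
domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B", "D"}}
interactions = [("P1", "P2"), ("P3", "P4"), ("P1", "P4")]
explanation = greedy_parsimonious_explanation(interactions, domains)
```

With uniform costs this corresponds in spirit to the PE setting, where the fewest domain pairs are preferred. Passing a `cost` function that assigns a lower value to co-occurring domain pairs would mimic the preferential selection described for GPE.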