Department of Computer Science, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada.
PLoS One. 2014 Feb 19;9(2):e88943. doi: 10.1371/journal.pone.0088943. eCollection 2014.
Identifying reliable domain-domain interactions will increase our ability to predict novel protein-protein interactions and to unravel interactions in protein complexes, and thus to gain more information about the function and behavior of genes. One of the challenges in identifying reliable domain-domain interactions is domain promiscuity. Promiscuous domains occur in many domain architectures and are therefore found in many proteins. This is a problem for any method that scores a domain-pair by the ratio between its observed and expected frequencies, because the protein-protein interaction network is sparse: most of the protein-pairs that contain a promiscuous domain-pair are non-interacting, so domain-pairs with promiscuous domains are penalized. This challenge to inferring reliable domain-domain interactions from protein-protein interactions has been recognized, and a number of work-arounds have been proposed. This paper reports on an application of Formal Concept Analysis to the problem. The relationship between formal concepts is found to provide a natural way for rare domains to elevate the rank of promiscuous domain-pairs and to enrich highly ranked domain-pairs with reliable domain-domain interactions. This piggybacking of promiscuous domain-pairs onto less promiscuous domain-pairs is possible only with concept lattices whose attribute-labels are not reduced, and it is enhanced by the presence of proteins that comprise both promiscuous and rare domains.
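The promiscuity penalty described above can be illustrated with a minimal sketch. It is not the scoring scheme used in the paper: as a simplifying assumption, the score of a domain-pair is taken to be the fraction of protein-pairs containing that domain-pair that actually interact, standing in for an observed-to-expected ratio. The proteins `P1` to `P6`, the promiscuous domain `K`, the rare domains `R1` and `R2`, and the function name `domain_pair_scores` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each protein maps to its set of domains.
# "K" is promiscuous (present in every protein); "R1" and "R2" are rare.
proteins = {
    "P1": {"K", "R1"},
    "P2": {"K", "R2"},
    "P3": {"K"},
    "P4": {"K"},
    "P5": {"K"},
    "P6": {"K"},
}

# A sparse interaction network: only one protein-pair interacts.
interactions = {frozenset({"P1", "P2"})}


def domain_pair_scores(proteins, interactions):
    """Score each domain-pair as (interacting protein-pairs containing it)
    divided by (all protein-pairs containing it). This is an assumed,
    simplified stand-in for an observed/expected frequency ratio."""
    observed = Counter()
    possible = Counter()
    for p, q in combinations(sorted(proteins), 2):
        # Unordered domain-pairs with one domain taken from each protein.
        pairs = {tuple(sorted((a, b))) for a in proteins[p] for b in proteins[q]}
        interacting = frozenset((p, q)) in interactions
        for dp in pairs:
            possible[dp] += 1
            if interacting:
                observed[dp] += 1
    return {dp: observed[dp] / possible[dp] for dp in possible}


scores = domain_pair_scores(proteins, interactions)
for dp, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(dp, round(score, 3))
```

In this toy network the rare pair `("R1", "R2")` and the promiscuous pair `("K", "K")` are supported by the same single interaction, yet the promiscuous pair scores far lower because it also appears in every non-interacting protein-pair.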