Pei Pengjun, Zhang Aidong
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA.
Proc IEEE Comput Syst Bioinform Conf. 2005:268-78. doi: 10.1109/csb.2005.8.
High-throughput methods for detecting protein-protein interactions (PPI) have given researchers an initial global picture of protein interactions on a genomic scale. The usefulness of this picture is, however, typically compromised by noisy data. How to integrate and use these non-congruent data sets effectively has received little attention to date. This paper proposes a model for integrating different data sets, constructed from prior knowledge of each data set's reliability. Based on this model, we propose a topological measurement to select reliable interactions and to quantify the similarity between two proteins' interaction profiles. The measurement exploits the small-world topological properties of the protein interaction network. We also report some additional properties of the network. We show that the measurement finds reliable interactions with improved performance and finds protein pairs with higher functional homogeneity.
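The abstract does not give the model or the measurement in closed form, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a noisy-OR rule for combining per-data-set reliability priors into one edge weight, and a weighted Jaccard overlap of neighbor sets as a stand-in for interaction-profile similarity. The data set names, the reliability values, and the function names `integrate`, `neighbor_profiles`, and `profile_similarity` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical reliability priors per data set (illustrative values only,
# not taken from the paper).
RELIABILITY = {"y2h": 0.3, "tap": 0.5, "small_scale": 0.9}


def integrate(observations, reliability):
    """Merge observations from several data sets into one weighted network.

    observations: iterable of (protein_a, protein_b, dataset_name).
    Returns {frozenset({a, b}): weight}. Assumed noisy-OR rule:
    weight = 1 - product of (1 - r_d) over the data sets reporting the pair.
    """
    seen = set()
    miss = defaultdict(lambda: 1.0)  # probability that every report is wrong
    for a, b, dataset in observations:
        if a == b:
            continue  # ignore self-interactions in this sketch
        pair = frozenset((a, b))
        if (pair, dataset) in seen:
            continue  # count each data set at most once per pair
        seen.add((pair, dataset))
        miss[pair] *= 1.0 - reliability[dataset]
    return {pair: 1.0 - m for pair, m in miss.items()}


def neighbor_profiles(weights):
    """Turn edge weights into {protein: {neighbor: weight}}."""
    profiles = defaultdict(dict)
    for pair, w in weights.items():
        a, b = tuple(pair)
        profiles[a][b] = w
        profiles[b][a] = w
    return profiles


def profile_similarity(profiles, u, v):
    """Weighted Jaccard overlap of the neighbor sets of u and v (assumed)."""
    nu, nv = profiles.get(u, {}), profiles.get(v, {})
    keys = set(nu) | set(nv)
    num = sum(min(nu.get(k, 0.0), nv.get(k, 0.0)) for k in keys)
    den = sum(max(nu.get(k, 0.0), nv.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0
```

Under these assumptions, a pair reported by several data sets receives a higher weight than a pair reported by one, and two proteins that share strongly supported neighbors score higher on `profile_similarity`. The actual measurement in the paper may differ in both respects.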