Voevodski Konstantin, Teng Shang-Hua, Xia Yu
Department of Computer Science, Boston University, Boston, MA 02215, USA.
BMC Syst Biol. 2009 Nov 29;3:112. doi: 10.1186/1752-0509-3-112.
Protein-protein interaction (PPI) networks enable us to better understand the functional organization of the proteome. We can learn a lot about a particular protein by querying its neighborhood in a PPI network to find proteins with similar function. A spectral approach that considers random walks between nodes of interest is particularly useful in evaluating closeness in PPI networks. Spectral measures of closeness are more robust to noise in the data and are more precise than simpler methods based on edge density and shortest path length.
We develop a novel affinity measure for pairs of proteins in PPI networks, which uses personalized PageRank, a random-walk-based method used in context-sensitive search on the Web. Our measure of closeness, which we call PageRank Affinity, is proportional to the number of times the smaller-degree protein is visited in a random walk that restarts at the larger-degree protein. Because PageRank considers paths of all lengths in a network, PageRank Affinity is a precise measure that is robust to noise in the data. PageRank Affinity is also provably related to cluster co-membership, making it a meaningful measure. In our experiments on protein networks, we find that our measure is better at predicting co-complex membership and finding functionally related proteins than other commonly used measures of closeness. Moreover, our experiments indicate that PageRank Affinity is very resilient to noise in the network. In addition, based on our method, we build a tool that quickly finds the nodes closest to a queried protein in any protein network and easily scales to much larger biological networks.
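To make the description of PageRank Affinity concrete, the following is a minimal sketch rather than the authors' implementation. It assumes an undirected, unweighted network stored as a dictionary of neighbor sets with no isolated nodes, a restart probability `alpha` of 0.15, and a fixed number of power-iteration steps; the function names, the toy network, and these parameter values are illustrative assumptions that do not come from the abstract.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=200):
    """Approximate the visit distribution of a random walk that restarts at `source`.

    adj    : dict mapping each node to the set of its neighbors
             (undirected, every node has at least one neighbor).
    alpha  : restart probability (assumed value, not taken from the paper).
    iters  : number of power-iteration steps (assumed value).
    """
    p = {v: 0.0 for v in adj}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        # Total probability mass is 1, so the restart step returns alpha to the source.
        nxt[source] += alpha
        for u, mass in p.items():
            # The remaining mass at u is split evenly among u's neighbors.
            share = (1.0 - alpha) * mass / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


def pagerank_affinity(adj, a, b, alpha=0.15):
    """Score for the smaller-degree node in a walk restarting at the larger-degree node."""
    hi, lo = (a, b) if len(adj[a]) >= len(adj[b]) else (b, a)
    return personalized_pagerank(adj, hi, alpha)[lo]


# Hypothetical toy network: two triangles (A-B-C and D-E-F) joined by the edge C-D.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E", "F"},
    "E": {"D", "F"},
    "F": {"D", "E"},
}

same_cluster = pagerank_affinity(toy, "A", "B")
other_cluster = pagerank_affinity(toy, "A", "F")
```

In this toy network, `same_cluster` comes out larger than `other_cluster`, which matches the intuition that proteins in the same densely connected group should score as closer than proteins separated by a bridge. Running a full power iteration per query is the simplest choice for a sketch; a tool meant for large networks would presumably use a faster approximation.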
We define a meaningful way to assess the closeness of two proteins in a PPI network, and show that our closeness measure is more biologically significant than other commonly used methods. We also develop a tool, accessible at http://xialab.bu.edu/resources/pnns, that allows the user to quickly find nodes closest to a queried vertex in any protein network available from BioGRID or specified by the user.