Silverbush Dana, Elberfeld Michael, Sharan Roded
Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
J Comput Biol. 2011 Nov;18(11):1437-48. doi: 10.1089/cmb.2011.0163. Epub 2011 Oct 14.
In a network orientation problem, one is given a mixed graph, consisting of directed and undirected edges, and a set of source-target vertex pairs. The goal is to orient the undirected edges so that a maximum number of pairs admit a directed path from the source to the target. This NP-complete problem arises in the context of analyzing physical networks of protein-protein and protein-DNA interactions. While the latter are naturally directed from a transcription factor to a gene, the direction of signal flow in protein-protein interactions is often unknown or cannot be measured en masse. One then tries to infer this information by using causality data on pairs of genes such that the perturbation of one gene changes the expression level of the other gene. Here we provide a first polynomial-size ILP formulation for this problem, which can be efficiently solved on current networks. We apply our algorithm to orient protein-protein interactions in yeast and measure our performance using edges with known orientations. We find that our algorithm achieves high accuracy and coverage in the orientation, outperforming simplified algorithmic variants that do not use information on edge directions. The obtained orientations can lead to a better understanding of the structure and function of the network.
在网络定向问题中,给定一个由有向边和无向边组成的混合图以及一组源 - 目标顶点对。目标是对无向边进行定向,以使最大数量的顶点对存在从源到目标的有向路径。这个NP完全问题出现在分析蛋白质 - 蛋白质和蛋白质 - DNA相互作用的物理网络的背景下。虽然蛋白质 - DNA相互作用自然是从转录因子指向基因,但蛋白质 - 蛋白质相互作用中的信号流方向通常是未知的,或者无法整体测量。然后人们试图通过使用基因对的因果关系数据来推断此信息,即一个基因的扰动会改变另一个基因的表达水平。在这里,我们为这个问题提供了第一个多项式规模的整数线性规划(ILP)公式,它可以在当前网络上高效求解。我们将我们的算法应用于确定酵母中蛋白质 - 蛋白质相互作用的方向,并使用具有已知方向的边来衡量我们的性能。我们发现我们的算法在定向方面实现了高精度和高覆盖率,优于不使用边方向信息的简化算法变体。所获得的定向可以有助于更好地理解网络的结构和功能。