Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, USA.
Nat Biotechnol. 2013 Aug;31(8):726-33. doi: 10.1038/nbt.2635. Epub 2013 Jul 14.
Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences, because correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph-theoretic tool, our results suggest that network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.
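The closed-form idea can be illustrated with a minimal sketch. It assumes the observed matrix is symmetric and is modeled as the sum of all powers of the direct matrix, G_obs = G_dir + G_dir^2 + G_dir^3 + ..., so that each eigenvalue of the direct matrix follows from an observed eigenvalue as lambda_dir = lambda_obs / (1 + lambda_obs). The function names, the toy three-node chain, and its edge weights are illustrative assumptions, not taken from the paper, and any preprocessing a real dataset might need (for example, scaling so that the series converges) is omitted.

```python
import numpy as np


def network_convolve(g_dir, max_power=200):
    """Approximate G_dir + G_dir^2 + ... by a truncated sum (hypothetical helper).

    Models an observed matrix that mixes direct and indirect effects.
    Converges only if every eigenvalue of g_dir has magnitude below 1.
    """
    total = np.zeros_like(g_dir)
    term = np.eye(g_dir.shape[0])
    for _ in range(max_power):
        term = term @ g_dir  # next power of the direct matrix
        total += term
    return total


def network_deconvolve(g_obs):
    """Invert the series for a symmetric matrix via eigendecomposition.

    Assumes no eigenvalue of g_obs equals -1 (otherwise the division fails).
    """
    eigvals, eigvecs = np.linalg.eigh(g_obs)
    direct_eigvals = eigvals / (1.0 + eigvals)
    return eigvecs @ np.diag(direct_eigvals) @ eigvecs.T


# Toy chain 0 - 1 - 2: nodes 0 and 2 have no direct link (assumed weights).
g_dir = np.array([
    [0.0, 0.4, 0.0],
    [0.4, 0.0, 0.4],
    [0.0, 0.4, 0.0],
])

g_obs = network_convolve(g_dir)      # indirect path 0-1-2 shows up at [0, 2]
g_rec = network_deconvolve(g_obs)    # indirect contribution removed again

print(np.round(g_obs, 3))
print(np.round(g_rec, 3))
```

In this toy case the observed matrix has a clearly nonzero entry between nodes 0 and 2, produced purely by the path through node 1, and the deconvolved matrix returns that entry to approximately zero while restoring the two direct edges.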