Li Ai, Horvath Steve
Department of Biostatistics, School of Public Health, University of California, Los Angeles, CA 90095-1772, USA.
Bioinformatics. 2007 Jan 15;23(2):222-31. doi: 10.1093/bioinformatics/btl581. Epub 2006 Nov 16.
The goal of neighborhood analysis is to find a set of genes (the neighborhood) that is similar to an initial 'seed' set of genes. Neighborhood analysis methods for network data are important in systems biology. If individual network connections are susceptible to noise, it can be advantageous to define neighborhoods on the basis of a robust interconnectedness measure, e.g. the topological overlap measure. Since the use of multiple nodes in the seed set may lead to more informative neighborhoods, it can be advantageous to define multi-node similarity measures.
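To make the interconnectedness measure concrete, the following is a minimal sketch of the standard pairwise topological overlap measure, TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij counts shared neighbors and k_i is the connectivity of node i. The function name `pairwise_tom` and the toy adjacency matrix are illustrative assumptions and are not taken from the authors' software.

```python
import numpy as np

def pairwise_tom(adj):
    """Pairwise topological overlap of a symmetric adjacency matrix.

    Assumes entries lie in [0, 1] and the diagonal is zero.
    """
    a = np.asarray(adj, dtype=float)
    shared = a @ a                   # l_ij = sum_u a_iu * a_uj
    k = a.sum(axis=1)                # connectivity k_i of each node
    k_min = np.minimum.outer(k, k)   # min(k_i, k_j) for every pair
    tom = (shared + a) / (k_min + 1.0 - a)
    np.fill_diagonal(tom, 1.0)       # convention: a node fully overlaps itself
    return tom

# Toy network (hypothetical): edges 0-1, 0-2, 1-2, 2-3
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
tom = pairwise_tom(adj)
```

In this toy network, nodes 0 and 3 are not directly connected but share neighbor 2, so their overlap is nonzero even though their adjacency is zero. This is the sense in which the measure is less sensitive to a single missing or spurious connection.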
The pairwise topological overlap measure is generalized to multiple network nodes and subsequently used in a recursive neighborhood construction method. A local permutation scheme is used to determine the neighborhood size. Using four network applications and a simulated example, we provide empirical evidence that the resulting neighborhoods are biologically meaningful; for example, we use neighborhood analysis to identify brain cancer-related genes.
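The idea of growing a neighborhood step by step from a seed set can be sketched as follows. This is a simplified stand-in, not the MTOM-based method itself: it scores each candidate by its mean pairwise similarity to the current members instead of a true multi-node overlap, and it takes the neighborhood size as a given parameter rather than deriving it from a permutation scheme. The function name `grow_neighborhood` and the similarity values are hypothetical.

```python
import numpy as np

def grow_neighborhood(sim, seed, size):
    """Greedily extend `seed` until the neighborhood holds `size` nodes.

    sim  : symmetric (n, n) similarity matrix, e.g. pairwise topological overlap
    seed : list of initial node indices
    size : target neighborhood size (assumed given; not estimated here)
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    members = list(seed)
    while len(members) < min(size, n):
        candidates = [v for v in range(n) if v not in members]
        # Simplification: mean pairwise similarity to the current members
        scores = [sim[v, members].mean() for v in candidates]
        members.append(candidates[int(np.argmax(scores))])
    return members

# Made-up toy similarities: nodes 0-2 form one tight group, nodes 3-4 another
sim = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.3, 0.2],
    [0.1, 0.2, 0.3, 1.0, 0.9],
    [0.0, 0.1, 0.2, 0.9, 1.0],
])
neighborhood = grow_neighborhood(sim, seed=[0], size=3)
```

Starting from node 0, the sketch adds node 1 and then node 2, staying inside the tightly connected group. Re-scoring candidates against the whole current set at each step is what lets a multi-node seed shape the result.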
An executable Windows program and tutorial for multi-node topological overlap measure (MTOM) based analysis can be downloaded from the webpage (http://www.genetics.ucla.edu/labs/horvath/MTOM/).