Kelley David R, Kingsford Carl
Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA.
J Comput Biol. 2011 Mar;18(3):379-90. doi: 10.1089/cmb.2010.0268.
Abstract: Genetic interactions (such as synthetic lethal interactions) have become quantifiable on a large scale using the epistatic miniarray profile (E-MAP) method. An E-MAP allows the construction of a large, weighted network of both aggravating and alleviating genetic interactions between genes. By clustering genes into modules and establishing relationships between those modules, we can discover compensatory pathways. We introduce a general framework for applying greedy clustering heuristics to probabilistic graphs. We use this framework to apply a graph clustering method called graph summarization to an E-MAP that targets yeast chromosome biology. The result is a new method for clustering E-MAP data that we call Expected Graph Compression (EGC). We validate modules and compensatory pathways using enriched Gene Ontology annotations and a novel method based on correlated gene expression. EGC finds a number of modules that no previous method for clustering E-MAP data has found. EGC also uncovers core submodules contained within several previously found modules, suggesting that EGC can reveal the finer structure of E-MAP networks.
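The idea of applying a greedy clustering heuristic to a probabilistic graph can be illustrated with a small sketch. This is not the authors' EGC implementation; it is a toy reading of the abstract under stated assumptions: each gene pair carries a hypothetical interaction probability, a pair of groups is encoded either by listing its expected edges or by one "superedge" plus its expected missing edges, and groups are merged greedily while the expected encoding cost drops. The names `pair_cost`, `total_cost`, and `greedy_cluster`, the cost function, and the probabilities are all illustrative assumptions.

```python
from itertools import combinations

def pair_cost(U, V, p):
    """Expected cost of encoding the edges between groups U and V (assumed cost model)."""
    if U == V:
        pairs = combinations(sorted(U), 2)       # edges inside one group
    else:
        pairs = ((u, v) for u in U for v in V)   # edges between two groups
    probs = [p.get(frozenset((u, v)), 0.0) for u, v in pairs]
    expected_edges = sum(probs)
    expected_missing = len(probs) - expected_edges
    # Either list every expected edge, or pay 1 for a superedge
    # and list the expected missing edges as corrections.
    return min(expected_edges, 1.0 + expected_missing)

def total_cost(clusters, p):
    """Sum the expected cost over all group pairs, including each group with itself."""
    return sum(pair_cost(U, V, p)
               for i, U in enumerate(clusters)
               for V in clusters[i:])

def greedy_cluster(nodes, p):
    """Repeatedly apply the merge that lowers the expected cost the most."""
    clusters = [frozenset([n]) for n in nodes]
    current = total_cost(clusters, p)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            merged = [c for k, c in enumerate(clusters) if k not in (a, b)]
            merged.append(clusters[a] | clusters[b])
            cost = total_cost(merged, p)
            if best is None or cost < best[0]:
                best = (cost, merged)
        if best[0] >= current - 1e-12:           # no merge improves the cost
            break
        current, clusters = best
    return clusters, current

# Toy data (made up): genes a and b each interact with c and d with probability 0.9.
toy_p = {frozenset(e): 0.9 for e in [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]}
modules, cost = greedy_cluster(["a", "b", "c", "d"], toy_p)
```

On this toy input the sketch groups `a` with `b` and `c` with `d`, joined by a single superedge, which mirrors the abstract's picture of two modules linked as a compensatory pair. How real E-MAP scores would be converted into probabilities is left open here.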