Department of Genetic Medicine, Weill Cornell Medical College, New York, NY 10065, USA.
Bioinformatics. 2012 Aug 1;28(15):2029-36. doi: 10.1093/bioinformatics/bts312. Epub 2012 Jun 8.
Computational inference methods that use graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identifying hub genes is especially difficult for these methods.
We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures.
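The relaxation described above can be illustrated with a toy example. The sketch below is not the authors' implementation and omits the empirical Bayes conditional independence testing entirely; it assumes each gene already has a set of retained neighbors (the hypothetical `neighbors` dictionary) and only contrasts a strict rule, where both genes of a pair must retain each other, with a relaxed rule, where one gene retaining the other is enough. The function name `symmetrize` and the gene labels are made up for illustration.

```python
from itertools import combinations


def symmetrize(neighbors, rule="or"):
    """Combine per-gene neighbor sets into a set of undirected edges.

    neighbors: dict mapping each gene to the set of genes it retained
               after its own (here unspecified) independence tests.
    rule:      "and" keeps an edge only if both genes retained each other;
               "or" keeps it if at least one gene retained the other.
    """
    if rule not in ("and", "or"):
        raise ValueError("rule must be 'and' or 'or'")
    edges = set()
    for a, b in combinations(sorted(neighbors), 2):
        a_sees_b = b in neighbors[a]
        b_sees_a = a in neighbors[b]
        keep = (a_sees_b and b_sees_a) if rule == "and" else (a_sees_b or b_sees_a)
        if keep:
            edges.add((a, b))
    return edges


# Hypothetical toy input: hub gene "H" retained only one of its three
# true neighbors, while each neighbor still retained "H".
neighbors = {
    "H": {"g1"},
    "g1": {"H"},
    "g2": {"H"},
    "g3": {"H"},
}

strict_edges = symmetrize(neighbors, rule="and")
relaxed_edges = symmetrize(neighbors, rule="or")
print(sorted(strict_edges))
print(sorted(relaxed_edges))
```

In this toy case the strict rule recovers a single edge at the hub, while the relaxed rule recovers all three, which is the intuition behind letting only one gene of a pair be aware of the connection in dense regions.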
Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion.
Software for running ELMM is available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx.
ramimahdi@yahoo.com or jgm45@cornell.edu
Supplementary data are available at Bioinformatics online.