Center for Bioinformatics and Genomics, Department of Global Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, 70112, USA.
Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA.
Sci Rep. 2019 Jul 26;9(1):10863. doi: 10.1038/s41598-019-47362-7.
Differential network analysis investigates how the network of connected genes changes from one condition to another, and it has become a prevalent tool for gaining a deeper and more comprehensive understanding of the molecular etiology of complex diseases. Based on the asymptotically normal estimation of large Gaussian graphical models (GGMs) in the high-dimensional setting, we developed a computationally efficient test for differential network analysis that tests the equality of two precision matrices, which summarize the conditional dependence network structures of the genes. Additionally, we applied a multiple testing procedure to infer the differential network structure with false discovery rate (FDR) control. Through extensive simulation studies with different combinations of parameters, including sample size, number of vertices, level of heterogeneity and graph structure, we demonstrated that our method outperformed currently available methods in both accuracy and computational time. In real data analysis on lung adenocarcinoma, at an FDR threshold of 0.05, we revealed a differential network with 3503 nodes and 2550 edges, which consisted of 50 clusters. Many of the top gene pairs in the differential network have been reported to be relevant to human cancers. Our method represents a powerful tool for network analysis of high-dimensional biological data.
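The general idea of edge-wise testing followed by FDR control can be illustrated with a minimal sketch. This is not the estimator described above: it assumes a low-dimensional setting (more samples than genes in each condition), so the precision matrix can be obtained by inverting the sample covariance, and it swaps in a classical Fisher z-test on partial correlations in place of the asymptotically normal high-dimensional estimator. The function names `partial_correlations` and `differential_edges` are hypothetical, and the Benjamini-Hochberg step-up procedure is used here only as one common choice of multiple testing procedure.

```python
import numpy as np
from math import erfc, sqrt


def partial_correlations(X):
    """Partial correlation matrix from the inverse sample covariance.

    Assumption: n > p, so the sample covariance is invertible. This is a
    low-dimensional stand-in, not the high-dimensional GGM estimator.
    """
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def differential_edges(X1, X2, fdr=0.05):
    """Return gene pairs (i, j) whose partial correlation differs between
    two conditions, with Benjamini-Hochberg FDR control."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    iu = np.triu_indices(p, k=1)  # each unordered pair once

    # Clip to keep arctanh finite if a partial correlation is exactly +/-1.
    r1 = np.clip(partial_correlations(X1)[iu], -0.999999, 0.999999)
    r2 = np.clip(partial_correlations(X2)[iu], -0.999999, 0.999999)

    # Fisher z for a partial correlation given p - 2 other variables has
    # approximate variance 1 / (n - 3 - (p - 2)) = 1 / (n - p - 1).
    se = sqrt(1.0 / (n1 - p - 1) + 1.0 / (n2 - p - 1))
    z = (np.arctanh(r1) - np.arctanh(r2)) / se

    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    pvals = np.array([erfc(abs(v) / sqrt(2.0)) for v in z])

    # Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
    # the largest rank with p_(k) <= fdr * k / m.
    m = pvals.size
    order = np.argsort(pvals)
    passed = pvals[order] <= fdr * np.arange(1, m + 1) / m
    k = int(np.max(np.nonzero(passed)[0])) + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True

    return [(int(i), int(j)) for i, j, rej in zip(iu[0], iu[1], rejected) if rej]
```

Calling `differential_edges(X1, X2)` on two samples-by-genes arrays returns the index pairs flagged as differential edges. In a truly high-dimensional setting, where genes outnumber samples, the matrix inversion in this sketch breaks down, which is the regime the method above is designed for.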