Kpogbezan Gino B, van der Vaart Aad W, van Wieringen Wessel N, Leday Gwenaël G R, van de Wiel Mark A
Department of Mathematics, University of Leiden, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands.
Department of Mathematics, Vrije Universiteit Amsterdam, De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands.
Biom J. 2017 Sep;59(5):932-947. doi: 10.1002/bimj.201600090. Epub 2017 Apr 10.
Reconstruction of a high-dimensional network may benefit substantially from the inclusion of prior knowledge on the network topology. In the case of gene interaction networks, such knowledge may come, for instance, from pathway repositories like KEGG, or be inferred from the data of a pilot study. The Bayesian framework provides a natural means of including such prior knowledge. Based on a Bayesian Simultaneous Equation Model, we develop an Empirical Bayes (EB) procedure that automatically assesses the agreement of the prior knowledge used with the data at hand. We use a variational Bayes method to approximate the posterior densities and compare its accuracy with that of a Gibbs sampling strategy. Our method is computationally fast and can outperform known competitors. In a simulation study, we show that prior knowledge can greatly improve the reconstruction of the network when it is accurate, and need not harm the reconstruction when it is wrong. We demonstrate the benefits of the method in an analysis of gene expression data from GEO. In particular, the edges of the recovered network are more reproducible over resampled versions of the data than those recovered by competing methods.