Vinciotti Veronica, Wit Ernst C, Jansen Rick, de Geus Eco J C N, Penninx Brenda W J H, Boomsma Dorret I, 't Hoen Peter A C
Department of Mathematics, Brunel University London, London, UK.
Johann Bernoulli Institute of Mathematics and Computer Science, University of Groningen, Groningen, The Netherlands.
BMC Bioinformatics. 2016 Jun 24;17:254. doi: 10.1186/s12859-016-1136-0.
Sparse Gaussian graphical models are popular for inferring biological networks, such as gene regulatory networks. In this paper, we investigate the consistency of these models across different data platforms, such as microarray and next generation sequencing, on the basis of a rich dataset containing samples that are profiled under both techniques as well as a large set of independent samples.
Our analysis shows that individual node variances can have a remarkable effect on the connectivity of the resulting network. Node variances are inconsistent across platforms, and the variability level of a node may not be linked to its regulatory role. As a result, failing to scale the data prior to the network analysis leads to networks that are not reproducible across platforms and that may be misleading. Moreover, we show that the reproducibility of networks across platforms is significantly higher if networks are summarised in terms of enrichment amongst functional groups of interest, such as pathways, rather than at the level of individual edges.
Careful pre-processing of transcriptional data and summaries of networks beyond individual edges can improve the consistency of network inference across platforms. However, caution is needed at this stage in the (over)interpretation of gene regulatory networks inferred from biological data.
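The scaling step mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated, the two "platforms" are assumed to differ only by a per-gene scale factor (a deliberate simplification), and the names `standardize`, `platform_a` and `platform_b` are hypothetical. The sketch only shows that the sample covariance, which is the input to a sparse Gaussian graphical model, depends on node variances, whereas the covariance of standardized data (the correlation matrix) does not.

```python
import numpy as np

def standardize(X):
    """Centre each column (gene) and scale it to unit sample variance."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    return centred / centred.std(axis=0, ddof=1)

rng = np.random.default_rng(0)
n_samples, n_genes = 200, 5

# Hypothetical expression matrix: samples in rows, genes in columns.
base = rng.normal(size=(n_samples, n_genes))
base[:, 1] += 0.8 * base[:, 0]  # induce dependence between genes 0 and 1

# Assumption: each platform rescales every gene by its own factor.
scales_a = np.array([1.0, 5.0, 0.2, 3.0, 1.0])
scales_b = np.array([4.0, 0.5, 2.0, 1.0, 0.1])
platform_a = base * scales_a
platform_b = base * scales_b

# Covariance of the raw data differs between the two platforms.
cov_a = np.cov(platform_a, rowvar=False)
cov_b = np.cov(platform_b, rowvar=False)

# After standardization both platforms give the same matrix,
# namely the correlation matrix of the underlying signal.
cov_a_std = np.cov(standardize(platform_a), rowvar=False)
cov_b_std = np.cov(standardize(platform_b), rowvar=False)

raw_gap = np.abs(cov_a - cov_b).max()
scaled_gap = np.abs(cov_a_std - cov_b_std).max()
print(f"largest covariance difference, raw data:    {raw_gap:.3f}")
print(f"largest covariance difference, scaled data: {scaled_gap:.2e}")
```

Real platform differences go beyond a per-gene rescaling, so standardization should not be expected to make the inputs identical in practice. The sketch only isolates the part of the discrepancy that is attributable to node variances.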