Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstr. 1, 39106 Magdeburg, Germany.
Bioinformatics. 2013 Jan 15;29(2):246-54. doi: 10.1093/bioinformatics/bts679. Epub 2012 Nov 21.
Systems genetics approaches, in particular those relying on genetical genomics data, offer a new paradigm for large-scale genome and network analysis. These methods use naturally occurring multi-factorial perturbations (e.g. polymorphisms) in properly controlled and screened genetic crosses to elucidate causal relationships in biological networks. However, although genetical genomics data contain rich information, cleanly separating causes from effects, as required for reconstructing gene regulatory networks, remains difficult.
We present a framework for reconstructing gene regulatory networks from genetical genomics data in which genotype and phenotype correlation measures are used to derive an initial graph, which is subsequently reduced by pruning strategies to minimize false positive predictions. Applying the framework to realistic simulated genetic data from a recent DREAM challenge, we demonstrate that our approach is simple yet effective and outperforms more complex methods (including the best performer) with respect to (i) reconstruction quality (especially for small sample sizes) and (ii) applicability to large data sets, owing to its relatively low computational cost. We also present reconstruction results from real genetical genomics data of yeast.
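The two-stage idea described above (a correlation-derived initial graph, followed by pruning) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation: the abstract does not specify the correlation measure, the threshold, or the pruning strategies, so the Pearson correlation, the fixed cutoff, the "weaker than an indirect path" pruning rule, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def genotype_phenotype_correlation(genotypes, expression):
    """Pearson correlation between genotype column i and expression column j.

    Both inputs are (samples x genes) arrays. Using Pearson correlation is an
    assumption; the abstract only mentions "correlation measures".
    """
    g = genotypes - genotypes.mean(axis=0)
    x = expression - expression.mean(axis=0)
    denom = np.outer(np.linalg.norm(g, axis=0), np.linalg.norm(x, axis=0))
    denom[denom == 0] = np.inf  # constant columns get correlation 0
    return (g.T @ x) / denom


def initial_graph(corr, threshold):
    """Candidate edge i -> j wherever |corr[i, j]| reaches the (assumed) cutoff."""
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)  # ignore each gene's effect on itself
    return adj


def prune_indirect_edges(adj, corr):
    """Drop i -> j if some path i -> k -> j has both edges stronger than i -> j.

    This is one hypothetical pruning rule, used here to show how indirect
    effects might be removed to reduce false positives.
    """
    weights = np.abs(corr)
    pruned = adj.copy()
    n = adj.shape[0]
    for i in range(n):
        for j in range(n):
            if not adj[i, j]:
                continue
            for k in range(n):
                if k in (i, j):
                    continue
                if (adj[i, k] and adj[k, j]
                        and min(weights[i, k], weights[k, j]) > weights[i, j]):
                    pruned[i, j] = False
                    break
    return pruned


# Toy example with a hand-built correlation matrix for a chain 0 -> 1 -> 2:
# gene 0 also correlates (more weakly) with gene 2 through gene 1.
toy_corr = np.array([
    [1.0, 0.9, 0.5],
    [0.0, 1.0, 0.8],
    [0.0, 0.1, 1.0],
])
toy_adj = initial_graph(toy_corr, threshold=0.4)
toy_pruned = prune_indirect_edges(toy_adj, toy_corr)
toy_edges = [(int(i), int(j)) for i, j in zip(*np.nonzero(toy_pruned))]
```

In this toy case the initial graph contains the edges 0 -> 1, 1 -> 2 and 0 -> 2, and the pruning step removes 0 -> 2 because the path through gene 1 is stronger on both legs. A real analysis would need a statistically justified threshold and the pruning strategies defined in the paper itself.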
A MATLAB implementation (script) of the reconstruction framework is available at www.mpi-magdeburg.mpg.de/projects/cna/etcdownloads.html