EMBL/CRG Research Unit in Systems Biology, Centre for Genomic Regulation-CRG and Universitat Pompeu Fabra-UPF, Barcelona, Spain.
PLoS Comput Biol. 2012;8(7):e1002589. doi: 10.1371/journal.pcbi.1002589. Epub 2012 Jul 12.
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. 
Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms.