Zhu Jun, Zhang Bin, Smith Erin N, Drees Becky, Brem Rachel B, Kruglyak Leonid, Bumgarner Roger E, Schadt Eric E
Rosetta Inpharmatics, LLC, Seattle, Washington 98109, USA.
Nat Genet. 2008 Jul;40(7):854-61. doi: 10.1038/ng.167. Epub 2008 Jun 15.
A key goal of biology is to construct networks that predict complex system behavior. We combine multiple types of molecular data, including genotypic, expression, transcription factor binding site (TFBS), and protein-protein interaction (PPI) data previously generated from a number of yeast experiments, in order to reconstruct causal gene networks. Networks based on different types of data are compared using metrics devised to assess the predictive power of a network. We show that a network reconstructed by integrating genotypic, TFBS and PPI data is the most predictive. This network is used to predict causal regulators responsible for hot spots of gene expression activity in a segregating yeast population. We also show that the network can elucidate the mechanisms by which causal regulators give rise to larger-scale changes in gene expression activity. We then prospectively validate predictions, providing direct experimental evidence that predictive networks can be constructed by integrating multiple, appropriate data types.