Veiga Diogo F T, Dutta Bhaskar, Balázsi Gábor
Department of Systems Biology-Unit 950, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA.
Mol Biosyst. 2010 Mar;6(3):469-80. doi: 10.1039/b916989j. Epub 2009 Dec 11.
The escalating amount of genome-scale data demands a pragmatic stance from the research community. How can we use this deluge of information to better understand biology, cure diseases, or engage cells in bioremediation or biomaterial production? A research pipeline that moves new sequence, expression and binding data towards practical end goals seems necessary. While most individual researchers are not motivated by such well-articulated pragmatic end goals, the scientific community has already self-organized to convert genomic data into fundamentally new biological knowledge and practical applications. Here we review two important steps in this workflow: network inference and network response identification, applied to transcriptional regulatory networks. Among network inference methods, we concentrate on relevance networks due to their conceptual simplicity. We classify and discuss network response identification approaches as either data-centric or network-centric. Finally, we conclude with an outlook on what is still missing from these approaches and what may be ahead on the road to biological discovery.