Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, Milan, Italy.
BMC Bioinformatics. 2009 Oct 15;10(Suppl 12):S1. doi: 10.1186/1471-2105-10-S12-S1.
The integration of data from multiple genome-wide assays is essential for understanding dynamic spatio-temporal interactions within cells. Such integration leads to a more complete view of cellular processes and offers the opportunity to make better sense of the large amount of "omics" data freely available in several public databases. In particular, integrating microarray-derived transcriptome data with other high-throughput analyses (genomic and mutational analysis, promoter analysis) may allow us to unravel transcriptional regulatory networks under a variety of physio-pathological conditions, such as the altered cross-talk between signal transduction pathways in transformed cells.
Here we sequentially apply web-based and statistical tools to a case study: the role of oncogenic activation of different signal transduction pathways in the transcriptional regulation of genes encoding proteins involved in the cAMP-PKA pathway. To this end, we first re-analyzed available genome-wide expression data for genes encoding proteins of the downstream branch of the PKA pathway in normal tissues and human tumor cell lines. Then, to identify mutation-dependent transcriptional signatures, we classified cancer cells according to their mutational state. The results of this procedure were used as a starting point to analyze the promoter structure of the genes encoding PKA pathway proteins. This analysis identified specific combinations of transcription factor binding sites that are consistent with available experimental data and help to clarify the relation between gene expression, transcription factors and oncogenes in our case study.
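The classification step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the sample names, mutation labels, expression values and function names are all hypothetical, and a plain difference of group means in log2 expression stands in for whatever statistics the study actually used.

```python
from statistics import mean

# Hypothetical toy data: log2 expression of a single PKA-pathway gene.
# Sample names, values and mutation labels are illustrative assumptions.
expression = {
    "normal_1": 8.0, "normal_2": 8.5,
    "line_A": 6.0, "line_B": 6.5,
    "line_C": 8.0, "line_D": 5.5,
}
mutation_state = {
    "line_A": "mutation_X",
    "line_B": "mutation_X",
    "line_C": "wild_type",
    "line_D": "mutation_Y",
}

def group_by_mutation(states):
    """Group cell-line names by their annotated mutational state."""
    groups = {}
    for sample, state in states.items():
        groups.setdefault(state, []).append(sample)
    return groups

def log2_change_vs_normal(expr, groups, normal_samples):
    """Mean log2 expression of each group minus the normal-tissue mean."""
    baseline = mean(expr[s] for s in normal_samples)
    return {
        state: mean(expr[s] for s in samples) - baseline
        for state, samples in groups.items()
    }

groups = group_by_mutation(mutation_state)
changes = log2_change_vs_normal(expression, groups, ["normal_1", "normal_2"])
for state, delta in sorted(changes.items()):
    print(f"{state}: {delta:+.2f}")
```

A negative value for a mutation group would indicate lower expression than in normal tissue for that gene; repeating the comparison across all genes of the pathway would yield one candidate signature per mutational class.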
Genome-wide, large-scale "omics" experimental technologies give different, complementary perspectives on the structure and regulatory properties of complex systems. Even the relatively simple, integrated workflow presented here offers opportunities not only to filter the noise intrinsic to high-throughput data, but also to progressively extract novel information that would otherwise have remained hidden. Indeed, we were able to detect a strong transcriptional repression of genes encoding proteins of the cAMP/PKA pathway in cancer cells of different genetic origins. The basic workflow presented here can easily be extended by incorporating other tools, and can be applied even by researchers with limited bioinformatics skills.