Lopes-Ramos Camila M, Paulson Joseph N, Chen Cho-Yi, Kuijjer Marieke L, Fagny Maud, Platig John, Sonawane Abhijeet R, DeMeo Dawn L, Quackenbush John, Glass Kimberly
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
BMC Genomics. 2017 Sep 12;18(1):723. doi: 10.1186/s12864-017-4111-x.
Cell lines are an indispensable tool in biomedical research and are often used as surrogates for tissues. Although important cellular and transcriptomic differences between cell lines and tissues are well recognized, no systematic overview has been conducted of how the regulatory processes of a cell line differ from those of its tissue of origin. The RNA-Seq data generated by the GTEx project are the first available resource that makes it possible to perform a large-scale transcriptional and regulatory network analysis comparing cell lines with their tissues of origin.
We compared 127 paired Epstein-Barr virus-transformed lymphoblastoid cell lines (LCLs) and whole blood samples, and 244 paired primary fibroblast cell lines and skin samples. While gene expression analysis confirms that these cell lines carry the expression signatures of their primary tissues, albeit at reduced levels, network analysis indicates that expression changes are the cumulative result of many previously unreported alterations in transcription factor (TF) regulation. More specifically, cell cycle genes are over-expressed in cell lines compared to primary tissues, and this alteration in expression results from less repressive TF targeting. We confirmed these regulatory changes for four TFs, including SMAD5, using independent ChIP-seq data from ENCODE.
Our results provide novel insights into the regulatory mechanisms controlling the expression differences between cell lines and tissues. The strong changes in TF regulation that we observe suggest that network changes, in addition to transcriptional levels, should be considered when using cell lines as models for tissues.