Guttman Mitchell, Amit Ido, Garber Manuel, French Courtney, Lin Michael F, Feldser David, Huarte Maite, Zuk Or, Carey Bryce W, Cassady John P, Cabili Moran N, Jaenisch Rudolf, Mikkelsen Tarjei S, Jacks Tyler, Hacohen Nir, Bernstein Bradley E, Kellis Manolis, Regev Aviv, Rinn John L, Lander Eric S
Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA.
Nature. 2009 Mar 12;458(7235):223-7. doi: 10.1038/nature07672. Epub 2009 Feb 1.
There is growing recognition that mammalian cells produce many thousands of large intergenic transcripts. However, the functional significance of these transcripts has been particularly controversial. Although there are some well-characterized examples, most (>95%) show little evidence of evolutionary conservation and have been suggested to represent transcriptional noise. Here we report a new approach to identifying large non-coding RNAs using chromatin-state maps to discover discrete transcriptional units intervening known protein-coding loci. Our approach identified approximately 1,600 large multi-exonic RNAs across four mouse cell types. In sharp contrast to previous collections, these large intervening non-coding RNAs (lincRNAs) show strong purifying selection in their genomic loci, exonic sequences and promoter regions, with greater than 95% showing clear evolutionary conservation. We also developed a functional genomics approach that assigns putative functions to each lincRNA, demonstrating a diverse range of roles for lincRNAs in processes from embryonic stem cell pluripotency to cell proliferation. We obtained independent functional validation for the predictions for over 100 lincRNAs, using cell-based assays. In particular, we demonstrate that specific lincRNAs are transcriptionally regulated by key transcription factors in these processes such as p53, NF-κB, Sox2, Oct4 (also known as Pou5f1) and Nanog. Together, these results define a unique collection of functional lincRNAs that are highly conserved and implicated in diverse biological processes.