CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria.
PLoS One. 2011;6(11):e27288. doi: 10.1371/journal.pone.0027288. Epub 2011 Nov 10.
Imprinted macro non-protein-coding (nc) RNAs are cis-repressor transcripts that silence multiple genes in at least three imprinted gene clusters in the mouse genome. Similar macro or long ncRNAs are abundant in the mammalian genome. Here we present the full coding and non-coding transcriptome of two mouse tissues, differentiated ES cells and fetal head, obtained using an optimized RNA-Seq strategy. The data are highly reproducible across sequencing locations and capture the full length of imprinted macro ncRNAs such as Airn and Kcnq1ot1, whose lengths range from 80 to 118 kb. Transcripts show more uniform read coverage when the RNA is fragmented by hydrolysis than when the cDNA is fragmented by shearing. Irrespective of the fragmentation method, all coding and non-coding transcripts longer than 8 kb show a gradual loss of sequencing tags towards the 3' end. Comparisons with published RNA-Seq datasets show that the strategy presented here detects known functional imprinted macro ncRNAs more efficiently, and also indicate that standardizing RNA preparation protocols would make transcriptomes more comparable between RNA-Seq datasets.
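The abstract does not say which metric was used to assess coverage uniformity or the loss of tags towards the 3' end. As one possible illustration, the sketch below assumes that per-base read depth for a transcript is already available as a list ordered 5' to 3'. The function names (`binned_coverage`, `coverage_cv`, `three_prime_ratio`) are hypothetical, and the two toy transcripts are synthetic, not data from the study.

```python
from statistics import mean, pstdev


def binned_coverage(coverage, n_bins=10):
    """Mean read depth in n_bins equal-width bins along a transcript (5' to 3')."""
    if n_bins <= 0 or len(coverage) < n_bins:
        raise ValueError("need at least one position per bin")
    length = len(coverage)
    # Integer bin edges so that every position falls in exactly one bin.
    edges = [i * length // n_bins for i in range(n_bins + 1)]
    return [mean(coverage[edges[i]:edges[i + 1]]) for i in range(n_bins)]


def coverage_cv(bins):
    """Coefficient of variation across bins; 0 means perfectly uniform coverage."""
    m = mean(bins)
    return pstdev(bins) / m if m > 0 else float("nan")


def three_prime_ratio(bins):
    """Depth of the last (3') bin relative to the first (5') bin."""
    return bins[-1] / bins[0] if bins[0] > 0 else float("nan")


# Synthetic 10 kb transcripts (assumed values, for illustration only).
uniform = [50] * 10_000
decaying = [100 - 80 * i / 9_999 for i in range(10_000)]  # depth falls 5' to 3'

uniform_bins = binned_coverage(uniform)
decaying_bins = binned_coverage(decaying)

print("uniform  CV:", coverage_cv(uniform_bins),
      "3'/5' ratio:", three_prime_ratio(uniform_bins))
print("decaying CV:", round(coverage_cv(decaying_bins), 3),
      "3'/5' ratio:", round(three_prime_ratio(decaying_bins), 3))
```

With a metric of this kind, a ratio well below 1 for long transcripts would correspond to the 3' tag loss described above, and a lower coefficient of variation would correspond to more uniform coverage. Binning by relative position, rather than by absolute distance, is a design choice that lets transcripts of different lengths be compared on the same scale.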