Ramsay LeeAnn, Marchetto Maria C, Caron Maxime, Chen Shu-Huang, Busche Stephan, Kwan Tony, Pastinen Tomi, Gage Fred H, Bourque Guillaume
Department of Human Genetics, McGill University, Dr Penfield Avenue, Montreal, H3A 1B1, Canada.
Lab of Genetics, Salk Institute for Biological Studies, 10010 N Torrey Pines Rd, La Jolla, CA 92037, USA.
BMC Genomics. 2017 Feb 28;18(1):214. doi: 10.1186/s12864-017-3568-y.
A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, various long non-coding RNAs (lncRNAs) derived from the human endogenous retrovirus subfamily H (HERVH) have been shown to be not only expressed but also required for pluripotency in human embryonic stem cells (hESCs).
To identify additional TE-derived functional non-coding transcripts, we generated RNA-seq data from induced pluripotent stem cells (iPSCs) of four primate species (human, chimpanzee, gorilla, and rhesus) and searched for transcripts whose expression was conserved. We observed that about 30% of TE instances expressed in human iPSCs had orthologous TE instances that were also expressed in chimpanzee and gorilla. Notably, our analysis revealed a number of repeat families with highly conserved expression profiles, including HERVH and MER53, the latter of which is known to be the source of a placental-specific family of microRNAs (miRNAs). We also identified a number of repeat families from all classes of TEs, including MLT1-type and Tigger families, that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved.
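As an illustration only, and not the authors' pipeline, the "about 30%" figure can be read as the fraction of human-expressed TE instances whose orthologs are also expressed in both chimpanzee and gorilla. The minimal sketch below assumes that expressed TE instances are already available as sets of identifiers per species and that an ortholog map exists; the function name, identifiers, and toy data are all hypothetical.

```python
from typing import Mapping, Sequence, Set


def conserved_fraction(
    expressed: Mapping[str, Set[str]],
    orthologs: Mapping[str, Mapping[str, str]],
    reference: str = "human",
    others: Sequence[str] = ("chimpanzee", "gorilla"),
) -> float:
    """Fraction of reference-expressed TE instances whose orthologs
    are expressed in every species listed in `others`.

    expressed: species name -> set of expressed TE instance IDs (assumed input)
    orthologs: reference TE ID -> {species name -> orthologous TE ID} (assumed input)
    """
    ref_expressed = expressed[reference]
    if not ref_expressed:
        return 0.0
    conserved = 0
    for te_id in ref_expressed:
        mapping = orthologs.get(te_id, {})
        # A missing ortholog yields None, which is never in an expressed set.
        if all(mapping.get(sp) in expressed[sp] for sp in others):
            conserved += 1
    return conserved / len(ref_expressed)


# Hypothetical toy data: four human TE instances, one with conserved expression.
toy_expressed = {
    "human": {"h1", "h2", "h3", "h4"},
    "chimpanzee": {"c1", "c2"},
    "gorilla": {"g1", "g3"},
}
toy_orthologs = {
    "h1": {"chimpanzee": "c1", "gorilla": "g1"},  # expressed in both
    "h2": {"chimpanzee": "c2", "gorilla": "g2"},  # gorilla ortholog not expressed
    "h3": {"gorilla": "g3"},                      # no chimpanzee ortholog
}

print(conserved_fraction(toy_expressed, toy_orthologs))  # 0.25
```

In practice, the definition of "expressed" (for example, a read-count threshold) and the method used to assign orthologous TE instances would determine the result; neither is specified here.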
Together, these results describe TE families and TE-derived lncRNAs whose conserved expression patterns can be used to identify what are likely functional TE-derived non-coding transcripts in primate iPSCs.