Department for BioMedical Research (DBMR), University of Bern, 3008 Bern, Switzerland.
Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010 Bern, Switzerland.
Genome Res. 2019 Feb;29(2):208-222. doi: 10.1101/gr.229922.117. Epub 2018 Dec 26.
The sequence domains underlying long noncoding RNA (lncRNA) activities, including their characteristic nuclear enrichment, remain largely unknown. It has been proposed that these domains can originate from neofunctionalized fragments of transposable elements (TEs), otherwise known as RIDLs (repeat insertion domains of lncRNA), although only a handful have been identified. It is challenging to distinguish functional RIDL instances from the large genomic background of neutrally evolving TEs. Here we show evidence that a subset of TE types experience evolutionary selection in the context of lncRNA exons. Together these comprise an enrichment group of 5374 TE fragments in 3566 loci. Their host lncRNAs tend to be functionally validated and associated with disease. We used this RIDL group to explore the relationship between TEs and lncRNA subcellular localization. Using global localization data from 10 human cell lines, we uncover a dose-dependent relationship between nuclear/cytoplasmic distribution and evolutionarily conserved L2b, MIRb, and MIRc elements. This relationship is observed in multiple cell types and is unaffected by the confounders of transcript length or expression. Experimental validation with engineered transgenes shows that these TEs drive nuclear enrichment in a natural sequence context. Together these data reveal a role for TEs in regulating the subcellular localization of lncRNAs.