Law Cheuk-Ting, Burns Kathleen H
Department of Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA.
Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA.
bioRxiv. 2025 Feb 3:2025.02.02.635956. doi: 10.1101/2025.02.02.635956.
Long interspersed element-1 (LINE-1, L1) retrotransposons are the most abundant protein-coding transposable elements (TEs) in mammalian genomes and have shaped genome content over 170 million years of evolution. LINE-1 is self-propagating and also mobilizes other sequences, including Alu elements. Occasionally, LINE-1 forms chimeric insertions with non-coding RNAs and mRNAs. Chimeras of U6 spliceosomal small nuclear RNA and LINE-1 are the best known, but there is no comprehensive catalog of LINE-1 chimeras. To address this, we developed TiMEstamp, a computational pipeline that leverages multiple sequence alignments (MSAs) to estimate the age of LINE-1 insertions and to identify candidate chimeric insertions in which an adjacent sequence arrives contemporaneously. Candidates were refined by detecting hallmark features of L1 retrotransposition, such as target site duplications (TSDs). Applying this pipeline to the human genome, we recovered all known species of LINE-1 chimeras and discovered new chimeric insertions involving small RNAs, Alu elements, and mRNA fragments. Some insertions are compatible with known mechanisms, such as RNA ligation. Other structures nominate novel mechanisms, such as trans-splicing. We also see evidence that LINE-1 loci with defunct promoters can acquire regulatory elements from nearby genes to restore retrotransposition activity. These discoveries highlight the recombinatory potential of LINE-1 RNA, with implications for genome evolution and TE domestication.
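The two ideas named in the abstract, dating an insertion from a multiple sequence alignment and checking for a target site duplication, can be illustrated with a minimal Python sketch. This is not the TiMEstamp implementation. The function names, the presence/absence dictionary, the clade ordering, and the exact-match TSD criterion are all assumptions made for illustration; a real pipeline would need to handle alignment gaps, poly(A) tails, and imperfect duplications.

```python
from typing import Dict, List, Optional


def oldest_clade_with_insertion(
    presence: Dict[str, bool], clades_near_to_far: List[str]
) -> Optional[str]:
    """Return the most distant clade in which the segment is present.

    `presence` maps a clade name to whether the segment aligns in that
    clade (hypothetical input derived from an MSA). `clades_near_to_far`
    lists clades by increasing divergence from the reference genome.
    """
    oldest = None
    for clade in clades_near_to_far:
        if presence.get(clade, False):
            oldest = clade
    return oldest


def arrived_together(
    presence_a: Dict[str, bool],
    presence_b: Dict[str, bool],
    clades_near_to_far: List[str],
) -> bool:
    """Assumed criterion: two adjacent segments are contemporaneous
    when both are traced back to the same most distant clade."""
    age_a = oldest_clade_with_insertion(presence_a, clades_near_to_far)
    age_b = oldest_clade_with_insertion(presence_b, clades_near_to_far)
    return age_a is not None and age_a == age_b


def find_tsd(
    upstream: str, downstream: str, min_len: int = 6, max_len: int = 20
) -> Optional[str]:
    """Return the longest exact sequence that both ends the upstream
    flank and starts the downstream flank, or None if none is found.

    Exact matching is a simplification; the length bounds are
    illustrative defaults, not values taken from the source.
    """
    upstream = upstream.upper()
    downstream = downstream.upper()
    longest = min(max_len, len(upstream), len(downstream))
    for k in range(longest, min_len - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return None


# Usage with made-up data (clade names and sequences are hypothetical).
clades = ["chimp", "macaque", "mouse"]
l1_presence = {"chimp": True, "macaque": True, "mouse": False}
partner_presence = {"chimp": True, "macaque": True, "mouse": False}

same_age = arrived_together(l1_presence, partner_presence, clades)
tsd = find_tsd("GGCCTAAGATTCAGT", "AAGATTCAGTCCGGA")
```

In this toy example both segments are traced to the same clade, so the pair would be kept as a chimeric candidate, and the flanks share a 10-bp duplication that supports a single retrotransposition event.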