Nakachi Yutaka, Du Jianbin, Watanabe Risa, Yanagida Yutaro, Bundo Miki, Iwamoto Kazuya
Department of Molecular Brain Science, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan.
Department of Geriatric Psychiatry, The Affiliated Mental Health Center of Jiangnan University, Wuxi, China.
Front Bioinform. 2025 May 22;5:1575346. doi: 10.3389/fbinf.2025.1575346. eCollection 2025.
The retrotransposon long interspersed nuclear element-1 (LINE-1, L1) constitutes a large proportion of the mammalian genome. A fraction of L1s that carry no deleterious mutations in their structure can amplify their copies through a process called retrotransposition (RT). RT affects genome stability and gene expression and is involved in the pathogenesis of many hereditary diseases. Measuring the expression of RT-capable L1s (rc-L1s) among the hundreds of thousands of non-rc-L1s is an essential step toward understanding the impact of RT. We developed mobile element-originated read enrichment from RNA-seq data (MORE-RNAseq), a pipeline for calculating the expression of rc-L1s using manually curated L1 references in humans and mice. MORE-RNAseq quantifies rc-L1 expression at the overall level (the sum of the expression of all rc-L1s) and at the level of individual rc-L1s, taking the genomic context into account. We applied MORE-RNAseq to publicly available RNA-seq data from human and mouse cancer cell lines, taken from studies that reported increased L1 expression. We found a significant increase in overall rc-L1 expression in both intergenic and intragenic contexts. We also identified differentially expressed rc-L1s at the locus level, which are important candidates for downstream analysis. Finally, we applied our method to RNA-seq data from young and aged human muscle, for which no prior information about L1 expression was available, and found a significant increase in rc-L1 expression in the aged samples. Our method will contribute to understanding the role of rc-L1s in various physiological and pathophysiological conditions using standard RNA-seq data. All scripts are available at https://github.com/molbrain/MORE-RNAseq.
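The overall-level measure described in the abstract, the sum of the expression of all rc-L1s split by genomic context, can be illustrated with a minimal sketch. This is not the MORE-RNAseq implementation: the locus names, read counts, library size, and the counts-per-million scaling are all assumptions made for illustration, and the actual scripts in the repository may normalize differently.

```python
from collections import defaultdict

# Hypothetical per-locus read counts for rc-L1s.
# Locus names, contexts, and counts are illustrative only, not real data.
rc_l1_counts = [
    {"locus": "rcL1_a", "context": "intergenic", "count": 12},
    {"locus": "rcL1_b", "context": "intragenic", "count": 30},
    {"locus": "rcL1_c", "context": "intergenic", "count": 8},
]


def overall_expression(records, library_size):
    """Sum per-locus counts by genomic context, scaled to counts per million.

    The CPM scaling is an assumed normalization for this sketch.
    """
    totals = defaultdict(int)
    for rec in records:
        totals[rec["context"]] += rec["count"]
    return {ctx: total * 1_000_000 / library_size for ctx, total in totals.items()}


# Assumed library size of 2 million mapped reads.
overall = overall_expression(rc_l1_counts, library_size=2_000_000)
print(overall)
```

Keeping the per-locus records separate from the summed totals mirrors the two levels of analysis named in the abstract: the same table can feed a locus-level differential expression test and the context-specific overall comparison.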