Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.
McKusick-Nathans Institute of Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.
Nucleic Acids Res. 2019 Mar 18;47(5):e27. doi: 10.1093/nar/gky1301.
Transposable elements (TEs) are interspersed repeat sequences that make up much of the human genome. Their expression has been implicated in development and disease. However, TE-derived RNA-seq reads are difficult to quantify. Past approaches have excluded these reads or aggregated RNA expression to subfamilies shared by similar TE copies, sacrificing quantitative accuracy or the genomic context necessary to understand the basis of TE transcription. As a result, the effects of TEs on gene expression and associated phenotypes are not well understood. Here, we present Software for Quantifying Interspersed Repeat Expression (SQuIRE), the first RNA-seq analysis pipeline that provides a quantitative and locus-specific picture of TE expression (https://github.com/wyang17/SQuIRE). SQuIRE is an accurate and user-friendly tool that can be used for a variety of species. We applied SQuIRE to RNA-seq from normal mouse tissues and a Drosophila model of amyotrophic lateral sclerosis. In both model organisms, we recapitulated previously reported TE subfamily expression levels and revealed locus-specific TE expression. We also identified differences in TE transcription patterns relating to transcript type, gene expression and RNA splicing that would be lost with other approaches using subfamily-level analyses. Altogether, our findings illustrate the importance of studying TE transcription with locus-level resolution.