Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.
Commun Biol. 2022 Oct 6;5(1):1063. doi: 10.1038/s42003-022-04020-5.
Transposable elements (TEs) contribute to the repetitive fraction of almost every eukaryotic genome known to date, and their transcriptional activation can influence the expression of neighboring genes in healthy and disease states. Single-cell RNA-Seq (scRNA-Seq) is a technical advance that allows the study of gene expression on a cell-by-cell basis. Although a computational approach is currently available for the single-cell analysis of TE expression, it omits the genomic location of the TEs. Here we present SoloTE, a pipeline that outperforms the previous approach in terms of computational resources and additionally allows locus-specific TE activity to be included in scRNA-Seq expression matrices. We then apply SoloTE to several datasets to reveal the repertoire of TEs that become transcriptionally active in different cell groups and, based on their genomic location, predict their potential impact on gene expression. Because our tool takes as input the files produced by standard scRNA-Seq processing pipelines, we expect it to be widely adopted in single-cell studies to help researchers discover patterns of cellular diversity associated with TE expression.