Azad Md Fakhrul, Tong Tong, Lau Nelson C
Department of Biochemistry and Cell Biology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA.
Graduate Program in Bioinformatics, Boston University, Boston, MA, 02118, USA.
Mob DNA. 2024 Oct 9;15(1):20. doi: 10.1186/s13100-024-00330-z.
Recent studies have suggested that transposable elements (TEs) residing in introns frequently splice into and alter primary gene-coding transcripts. To re-examine how often TEs are exonized into protein-coding gene transcripts, we re-analyzed a Drosophila neuron circadian rhythm RNAseq dataset and a deep long RNA fly midbrain RNAseq dataset using our Transposon Insertion and Depletion Analyzer (TIDAL) program. From the RNAseq data, TIDAL predicted several TE insertions that were consistent with previously published studies. However, we also uncovered many discrepancies in TE-exonization calls, such as reads that mainly support intron retention of the TE and give little support for chimeric mRNA spliced to the TE. We then deployed rigorous genomic DNA-PCR (gDNA-PCR) and RT-PCR procedures on TE-mRNA fusion candidates to see how many of the bioinformatics predictions could be validated. Testing the w1118 strain from which the deeper long RNAseq data was derived and comparing it to an OreR strain, we could validate only 9 of 23 TIDAL candidates (<40%) as novel TE insertions by gDNA-PCR, indicating that deeper study is needed when using RNAseq data as input to current TE-insertion prediction programs. For these validated calls, our RT-PCR results supported only TE-intron retention. Lastly, for the Dscam2 and Bx genes, which contained intronic TEs in the w1118 strain, gene expression was 2-3 times higher than for the corresponding OreR genes lacking the TEs. This study's validation approach indicates that chimeric TE-mRNAs are infrequent and cautions that bioinformatics programs require more optimization to call TE insertions from RNAseq datasets.