Britto-Borges Thiago, Gehring Niels H, Boehm Volker, Dieterich Christoph
Section of Bioinformatics and Systems Cardiology, Department of Internal Medicine III and Klaus Tschira Institute for Integrative Computational Cardiology, Heidelberg University Hospital, 69120 Heidelberg, Germany.
DZHK (German Centre for Cardiovascular Research), Partner site Heidelberg/Mannheim, 69120 Heidelberg, Germany.
RNA. 2024 Sep 16;30(10):1277-1291. doi: 10.1261/rna.080066.124.
The nonsense-mediated RNA decay (NMD) pathway is a crucial mechanism of mRNA quality control. Current annotations of NMD substrate RNAs are rarely data-driven and instead rely on generally established rules. We present a data set with four cell lines and combinations of NMD factor knockdowns or knockout. Based on this data set, we implemented a workflow that combines Nanopore and Illumina sequencing to assemble a transcriptome that is enriched for NMD target transcripts. Moreover, we use coding sequence (CDS) information from Ensembl, Gencode consensus Ribo-seq ORFs, and OpenProt to enhance the CDS annotation of novel transcript isoforms. In summary, 302,889 transcripts were obtained from the transcriptome assembly process, of which 24% are absent from Ensembl database annotations, 48,213 contain a premature stop codon, and 6,433 are significantly upregulated in three or more comparisons of NMD-active versus NMD-deficient cell lines. We present an in-depth view of these results through the NMDtxDB database, which is available at https://shiny.dieterichlab.org/app/NMDtxDB and supports the study of NMD-sensitive transcripts. We open-sourced our implementation of the web application and the analysis workflow at https://github.com/dieterich-lab/NMDtxDB and https://github.com/dieterich-lab/nmd-wf, respectively.
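The selection criterion mentioned in the abstract (significantly upregulated in three or more comparisons) can be illustrated with a minimal sketch. This is not the authors' implementation. The record layout, the function name `upregulated_in_min_comparisons`, the comparison labels, and the cutoffs (adjusted p-value below 0.05, positive log2 fold change) are all assumptions made for illustration, since the abstract does not state them.

```python
from collections import defaultdict

# Hypothetical thresholds; the abstract does not state the cutoffs used.
PADJ_CUTOFF = 0.05
MIN_LOG2FC = 0.0
MIN_COMPARISONS = 3


def upregulated_in_min_comparisons(results, min_comparisons=MIN_COMPARISONS):
    """Return transcript IDs significantly upregulated in >= min_comparisons.

    `results` is an iterable of (transcript_id, comparison, log2fc, padj)
    tuples, one per transcript and comparison (assumed layout).
    """
    hits = defaultdict(set)
    for transcript_id, comparison, log2fc, padj in results:
        # Count a comparison only if it is both significant and upregulated.
        if padj < PADJ_CUTOFF and log2fc > MIN_LOG2FC:
            hits[transcript_id].add(comparison)
    return sorted(t for t, comps in hits.items() if len(comps) >= min_comparisons)


# Toy records, invented for illustration only (not data from the study).
toy_results = [
    ("tx1", "cmpA", 2.1, 0.001),
    ("tx1", "cmpB", 1.4, 0.010),
    ("tx1", "cmpC", 0.9, 0.030),
    ("tx2", "cmpA", 1.8, 0.002),
    ("tx2", "cmpB", 0.2, 0.400),
    ("tx2", "cmpC", -0.5, 0.020),
    ("tx3", "cmpA", 3.0, 0.200),
]

selected = upregulated_in_min_comparisons(toy_results)
print(selected)  # ['tx1']
```

Using a set per transcript means a comparison is counted once even if it appears in several records, so the threshold applies to distinct comparisons rather than to rows.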