Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore.
Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Republic of Singapore.
Nat Methods. 2023 Aug;20(8):1187-1195. doi: 10.1038/s41592-023-01908-w. Epub 2023 Jun 12.
Most approaches to transcript quantification rely on fixed reference annotations; however, the transcriptome is dynamic and, depending on the context, such static annotations contain inactive isoforms for some genes, whereas they are incomplete for others. Here we present Bambu, a method that performs machine-learning-based transcript discovery to enable quantification specific to the context of interest using long-read RNA-sequencing. To identify novel transcripts, Bambu estimates the novel discovery rate, which replaces arbitrary per-sample thresholds with a single, interpretable, precision-calibrated parameter. Bambu retains the full-length and unique read counts, enabling accurate quantification in the presence of inactive isoforms. Compared to existing methods for transcript discovery, Bambu achieves greater precision without sacrificing sensitivity. We show that context-aware annotations improve quantification for both novel and known transcripts. We apply Bambu to quantify isoforms from repetitive HERVH-LTR7 retrotransposons in human embryonic stem cells, demonstrating its capacity for context-specific transcript expression analysis.
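The idea of a single novel discovery rate parameter can be illustrated with a small sketch. It is not Bambu's implementation: it assumes each candidate transcript already carries a classifier score, and it takes the novel discovery rate at a given rank to be the fraction of unannotated candidates among all candidates scoring at or above that rank. The names `Candidate`, `novel_discovery_rates`, and `select_novel`, as well as the scores in the example, are hypothetical, and tied scores are not handled specially.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: float     # hypothetical classifier score; higher means more likely real
    annotated: bool  # True if the candidate matches a reference transcript


def novel_discovery_rates(candidates):
    """Rank candidates by descending score and pair each with the running
    fraction of unannotated candidates at or above its rank (assumed NDR)."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    rates = []
    novel_seen = 0
    for rank, cand in enumerate(ranked, start=1):
        if not cand.annotated:
            novel_seen += 1
        rates.append((cand, novel_seen / rank))
    return rates


def select_novel(candidates, max_ndr):
    """Return names of novel candidates inside the longest ranked prefix
    whose assumed NDR does not exceed max_ndr."""
    rates = novel_discovery_rates(candidates)
    cutoff = 0
    for rank, (_, ndr) in enumerate(rates, start=1):
        if ndr <= max_ndr:
            cutoff = rank
    return [cand.name for cand, _ in rates[:cutoff] if not cand.annotated]


# Made-up example data for illustration only.
example = [
    Candidate("A", 0.95, True),
    Candidate("B", 0.90, True),
    Candidate("C", 0.85, False),
    Candidate("D", 0.80, True),
    Candidate("E", 0.50, False),
    Candidate("F", 0.30, False),
]

strict = select_novel(example, max_ndr=0.1)
moderate = select_novel(example, max_ndr=0.3)
```

In this toy setting, lowering `max_ndr` admits fewer novel candidates and raising it admits more, which is the sense in which one parameter stands in for per-sample score thresholds.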