Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, Australia.
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
Nucleic Acids Res. 2022 Feb 28;50(4):e19. doi: 10.1093/nar/gkab1129.
Accurately quantifying gene and isoform expression changes is essential to understanding cell functions, differentiation and disease. Sequencing full-length native RNAs using long-read direct RNA sequencing (DRS) has the potential to overcome many limitations of short- and long-read sequencing methods that require RNA fragmentation, cDNA synthesis or PCR. However, there is a lack of tools specifically designed for DRS, and the ability of DRS to identify differential expression in complex organisms is poorly characterised. We developed NanoCount for fast, accurate transcript isoform quantification in DRS and demonstrate that it outperforms similar methods. Using synthetic controls and human SH-SY5Y cell differentiation into neuron-like cells, we show that DRS accurately quantifies RNA expression and identifies differential expression of genes and isoforms. We detected differential expression of 231 genes and 333 isoforms, plus 27 isoform switches, between undifferentiated and differentiated SH-SY5Y cells, and samples clustered by differentiation state at the gene and isoform level. Genes upregulated in neuron-like cells were associated with neurogenesis. NanoCount quantification of thousands of novel isoforms discovered with DRS likewise enabled identification of their differential expression. Our results demonstrate enhanced DRS isoform quantification with NanoCount and establish the ability of DRS to identify biologically relevant differential expression of genes and isoforms.