Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6.
Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada V6H 3N1.
Genome Res. 2020 Aug;30(8):1191-1200. doi: 10.1101/gr.260174.119. Epub 2020 Aug 17.
Despite rapid advances in single-cell RNA sequencing (scRNA-seq) technologies over the last decade, single-cell transcriptome analysis workflows have primarily used gene expression data, while isoform sequence analysis at the single-cell level remains fairly limited. Detection and discovery of isoforms in single cells is difficult because of the inherent technical shortcomings of scRNA-seq data, and existing transcriptome assembly methods are mainly designed for bulk RNA samples. To address this challenge, we developed RNA-Bloom, an assembly algorithm that leverages the rich information content aggregated from multiple single-cell transcriptomes to reconstruct cell-specific isoforms. Assembly with RNA-Bloom can be either reference-guided or reference-free, thus enabling unbiased discovery of novel isoforms or foreign transcripts. We compared both assembly strategies of RNA-Bloom against five state-of-the-art reference-free and reference-based transcriptome assembly methods. In our benchmarks on a simulated 384-cell data set, reference-free RNA-Bloom reconstructed 37.9%-38.3% more isoforms than the best reference-free assembler, whereas reference-guided RNA-Bloom reconstructed 4.1%-11.6% more isoforms than reference-based assemblers. When applied to a real 3840-cell data set consisting of more than 4 billion reads, RNA-Bloom reconstructed 9.7%-25.0% more isoforms than the best competing reference-based and reference-free approaches evaluated. We expect RNA-Bloom to boost the utility of scRNA-seq data beyond gene expression analysis, expanding what is currently accessible to informatic analysis.
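The "X% more isoforms" figures above are relative gains over a baseline assembler. A minimal sketch of that arithmetic follows, assuming the gain is computed as the difference in reconstructed isoform counts divided by the baseline count. The function name `percent_more` and the counts used are hypothetical, chosen only to illustrate the calculation; they are not values from the paper.

```python
def percent_more(ours: int, baseline: int) -> float:
    """Relative gain of `ours` over `baseline`, expressed in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be a positive isoform count")
    return 100.0 * (ours - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical isoform counts, for illustration only (not from the paper).
    reconstructed_by_new_method = 1379
    reconstructed_by_baseline = 1000
    gain = percent_more(reconstructed_by_new_method, reconstructed_by_baseline)
    print(f"{gain:.1f}% more isoforms than the baseline")
```

A range such as 37.9%-38.3% would then correspond to the same calculation repeated under different evaluation settings, each yielding its own gain.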