Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064.
Genomics Institute, University of California, Santa Cruz, CA 95064.
Proc Natl Acad Sci U S A. 2018 Sep 25;115(39):9726-9731. doi: 10.1073/pnas.1806447115. Epub 2018 Sep 10.
High-throughput short-read sequencing has revolutionized how transcriptomes are quantified and annotated. However, while Illumina short-read sequencers can be used to analyze entire transcriptomes down to the level of individual splicing events with great accuracy, they fall short of analyzing how these individual events are combined into complete RNA transcript isoforms. Because of this shortfall, long-distance information is required to complement short-read sequencing to analyze transcriptomes on the level of full-length RNA transcript isoforms. While long-read sequencing technology can provide this long-distance information, there are issues with both Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) long-read sequencing technologies that prevent their widespread adoption. Briefly, PacBio sequencers produce low numbers of reads with high accuracy, while ONT sequencers produce higher numbers of reads with lower accuracy. Here, we introduce and validate a long-read ONT-based sequencing method. At the same cost, our Rolling Circle Amplification to Concatemeric Consensus (R2C2) method generates more accurate reads of full-length RNA transcript isoforms than any other available long-read sequencing method. These reads can then be used to generate isoform-level transcriptomes for both genome annotation and differential expression analysis in bulk or single-cell samples.
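The abstract does not describe how R2C2 turns a concatemeric read into a consensus, so the following is only a toy illustration of the general idea the method's name points to: if one raw read contains several noisy copies of the same molecule, combining the copies can yield a more accurate sequence than any single copy. This is not the authors' pipeline. The error model (independent substitutions only), the 15% error rate, the seven copies, and all function names are assumptions made for illustration; real nanopore reads also contain insertions and deletions, so the copies would have to be aligned before any vote.

```python
import random
from collections import Counter

BASES = "ACGT"


def noisy_copy(seq, error_rate, rng):
    """Return a copy of seq with independent substitution errors (toy model)."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def majority_consensus(copies):
    """Per-position majority vote over equal-length, already aligned copies."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
template = "".join(rng.choice(BASES) for _ in range(2000))

# Hypothetical settings: 7 copies per concatemer, 15% substitution rate.
copies = [noisy_copy(template, 0.15, rng) for _ in range(7)]

single_identity = identity(template, copies[0])
consensus_identity = identity(template, majority_consensus(copies))

print(f"single copy identity: {single_identity:.3f}")
print(f"consensus identity:   {consensus_identity:.3f}")
```

Under these assumptions the consensus is wrong at a position only when the erroneous bases outvote the correct one, which is much rarer than a single-copy error, so the consensus identity comes out well above that of any individual copy.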