Haile Simon, Corbett Richard D, O'Neill Kieran, Xu Jing, Smailus Duane E, Pandoh Pawan K, Bayega Anthony, Bala Miruna, Chuah Eric, Coope Robin J N, Moore Richard A, Mungall Karen L, Zhao Yongjun, Ma Yussanne, Marra Marco A, Jones Steven J M, Mungall Andrew J
Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada.
Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada.
Front Genet. 2024 Dec 2;15:1466338. doi: 10.3389/fgene.2024.1466338. eCollection 2024.
The advent of long-read (LR) sequencing technologies has made it possible to determine transcript structures directly, with the potential for end-to-end sequencing of full-length RNAs. LR methods described to date include commercial offerings from Oxford Nanopore Technologies (ONT) and Pacific Biosciences. These kits rely on selection of polyadenylated (polyA+) RNAs and/or oligo-dT priming of reverse transcription. As a result, they exclude non-polyadenylated (polyA-) RNAs and so do not allow comprehensive interrogation of the transcriptome. In addition, polyA+ specificity results in 3'-biased measurements of polyA+ RNAs, especially when the input RNA is partially degraded. To address these limitations of current LR protocols, we modified two rRNA depletion protocols that have been used in short-read sequencing, one ligation-based and the other based on template-switch cDNA synthesis, so that they append ONT-specific adaptor sequences and omit any deliberate fragmentation/shearing of RNA/cDNA. Here, we present comparisons with polyA+ RNA-specific versions of the two approaches, including the ONT PCR-cDNA Barcoding kit. The rRNA depletion protocols displayed higher proportions (30%-50%) of intronic content than the polyA-specific protocols (5%-8%). In addition, the rRNA depletion protocols detected approximately 20%-50% more expressed genes. Other metrics that favoured the rRNA depletion protocols include better coverage of long transcripts and higher accuracy and reproducibility of expression measurements. Overall, these results indicate that the rRNA depletion-based protocols described here allow comprehensive characterization of both polyadenylated and non-polyadenylated RNAs. While the resulting reads are long enough to help decipher transcript structures, further work is warranted to increase the proportion of individual reads that span transcripts end to end.
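The closing point about end-to-end spanning can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' analysis: the `Alignment` record, the function names, and the 50-nucleotide tolerance are all assumptions introduced here. It assumes each read has already been aligned to a single annotated transcript and that the alignment is given in transcript coordinates.

```python
from typing import Iterable, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record: one read aligned to one annotated transcript.

    Coordinates are in transcript space, 0-based and half-open.
    """
    read_id: str
    aln_start: int  # first transcript position covered by the read
    aln_end: int    # one past the last transcript position covered
    tx_length: int  # annotated transcript length in nucleotides


def spans_end_to_end(aln: Alignment, tolerance: int = 50) -> bool:
    """Return True if the read reaches within `tolerance` nt of both transcript ends.

    The tolerance value is an assumed illustration, not a value from the paper.
    """
    reaches_5p = aln.aln_start <= tolerance
    reaches_3p = aln.aln_end >= aln.tx_length - tolerance
    return reaches_5p and reaches_3p


def end_to_end_fraction(alns: Iterable[Alignment], tolerance: int = 50) -> float:
    """Fraction of alignments that span their transcript end to end."""
    alns = list(alns)
    if not alns:
        return 0.0
    return sum(spans_end_to_end(a, tolerance) for a in alns) / len(alns)


if __name__ == "__main__":
    # Toy alignments, invented for illustration; not data from the study.
    example = [
        Alignment("r1", 0, 1000, 1000),    # full length
        Alignment("r2", 30, 980, 1000),    # full length within tolerance
        Alignment("r3", 400, 1000, 1000),  # 3'-biased fragment
        Alignment("r4", 0, 500, 1000),     # truncated at the 3' end
    ]
    print(end_to_end_fraction(example))
```

A read such as `r3`, which covers only the 3' portion of its transcript, is the kind of 3'-biased read that oligo-dT priming of partially degraded RNA would be expected to produce. A real analysis would also need to handle reads that match several isoforms, which this sketch ignores.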