Sahlin Kristoffer, Medvedev Paul
Department of Mathematics, Science for Life Laboratory, Stockholm University, 106 91, Stockholm, Sweden.
Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA.
Nat Commun. 2021 Jan 4;12(1):2. doi: 10.1038/s41467-020-20340-8.
Oxford Nanopore (ONT) is a leading long-read technology that has been revolutionizing transcriptome analysis through its capacity to sequence the majority of transcripts from end to end. This has greatly increased our ability to study the diversity of transcription mechanisms such as transcription initiation, termination, and alternative splicing. However, ONT still suffers from high error rates, which have thus far limited its scope to reference-based analyses. When a reference is not available, or is not a viable option due to reference bias, error correction is a crucial step towards reconstructing the sequenced transcripts and analyzing them downstream. In this paper, we present a novel computational method to error-correct ONT cDNA sequencing data, called isONcorrect. IsONcorrect is able to jointly use all isoforms from a gene during error correction, thereby allowing it to correct reads at low sequencing depths. We obtain a median accuracy of 98.9-99.6%, demonstrating the feasibility of applying cost-effective full-transcript-length cDNA sequencing for reference-free transcriptome analysis.