Computational Biology Unit, Department of Informatics, University of Bergen, 5008 Bergen, Norway.
Sars International Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway.
RNA. 2019 Oct;25(10):1229-1241. doi: 10.1261/rna.071332.119. Epub 2019 Jul 2.
Polyadenylation at the 3'-end is a major regulator of messenger RNA, and poly(A) tail length is known to affect nuclear export, stability, and translation, among other processes. Only recently have strategies emerged that allow genome-wide assessment of poly(A) length. These methods link poly(A) tail measurements to genes indirectly, by short-read alignment to genetic 3'-ends. Concurrently, Oxford Nanopore Technologies (ONT) established full-length, isoform-specific RNA sequencing that captures the entire poly(A) tail. However, assessing poly(A) length through base-calling has so far not been possible because ONT sequencing cannot resolve long homopolymeric stretches. Here we present an R package to estimate poly(A) tail length from ONT long-read sequencing data. The package operates on unaligned, base-called data. It measures poly(A) tail length from both native RNA and DNA sequencing, which makes poly(A) tail studies by full-length cDNA approaches possible for the first time. We assess the package's performance across different poly(A) lengths, demonstrating that it is a versatile tool providing poly(A) tail estimates across a wide range of sequencing conditions.