Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Molecular and Cellular Biology Program, University of Washington, Seattle, WA 98195, USA.
Bioinformatics. 2022 May 13;38(10):2927-2929. doi: 10.1093/bioinformatics/btac165.
PacBio sequencing is increasingly used to characterize barcoded libraries of genetic variants. However, current approaches to resolving PacBio sequencing artifacts can leave many reads incorrectly identified or unusable. Here, we developed the PacBio Read Alignment Tool (PacRAT), which improves the accuracy of barcode-variant mapping through several steps of read alignment and consensus calling. To quantify the performance of our approach, we simulated PacBio reads from eight variant libraries of various lengths and showed that PacRAT improves the accuracy of pairing barcodes with variants across these libraries. Analysis of real (non-simulated) libraries also showed that PacRAT increases the number of reads that can be used for downstream analyses.
PacRAT is written in Python and is freely available (https://github.com/dunhamlab/PacRAT).
Supplemental data are available at Bioinformatics online.
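As a rough illustration of the general idea of consensus calling for barcode-variant mapping, the sketch below groups reads by barcode and takes a per-column majority vote. It is not PacRAT's implementation: it assumes the reads sharing a barcode have already been aligned to equal length (with "-" for gaps), and the function names `column_consensus` and `map_barcodes`, the `min_reads` parameter, and the use of "N" for tied columns are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def column_consensus(aligned_seqs):
    """Majority-vote consensus over pre-aligned, equal-length sequences.

    Columns with a tie between the two most common characters become 'N'.
    Gap characters ('-') that win a column are dropped from the result.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_seqs):
        top = Counter(column).most_common(2)
        if len(top) > 1 and top[0][1] == top[1][1]:
            consensus.append("N")  # ambiguous column (assumed convention)
        else:
            consensus.append(top[0][0])
    return "".join(consensus).replace("-", "")


def map_barcodes(reads, min_reads=1):
    """Build a barcode -> consensus variant map.

    `reads` is an iterable of (barcode, aligned_sequence) pairs.
    Barcodes supported by fewer than `min_reads` reads are skipped.
    """
    by_barcode = defaultdict(list)
    for barcode, seq in reads:
        by_barcode[barcode].append(seq)

    return {
        barcode: column_consensus(seqs)
        for barcode, seqs in by_barcode.items()
        if len(seqs) >= min_reads
    }


if __name__ == "__main__":
    example_reads = [
        ("AAGT", "ACGT"),
        ("AAGT", "ACGT"),
        ("AAGT", "ACTT"),  # one read carries a sequencing error
        ("CCTA", "TTGA"),
    ]
    print(map_barcodes(example_reads, min_reads=2))
```

In practice the alignment step matters as much as the vote: PacBio errors are often insertions or deletions, so a column-wise vote is only meaningful once the reads for a barcode have been aligned to one another.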