

PacRAT: a program to improve barcode-variant mapping from PacBio long reads using multiple sequence alignment.

Author information

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.

Molecular and Cellular Biology Program, University of Washington, Seattle, WA 98195, USA.

Publication information

Bioinformatics. 2022 May 13;38(10):2927-2929. doi: 10.1093/bioinformatics/btac165.

Abstract

SUMMARY

Use of PacBio sequencing for characterizing barcoded libraries of genetic variants is on the rise. However, current approaches in resolving PacBio sequencing artifacts can result in a high number of incorrectly identified or unusable reads. Here, we developed a PacBio Read Alignment Tool (PacRAT) that improves the accuracy of barcode-variant mapping through several steps of read alignment and consensus calling. To quantify the performance of our approach, we simulated PacBio reads from eight variant libraries of various lengths and showed that PacRAT improves the accuracy in pairing barcodes and variants across these libraries. Analysis of real (non-simulated) libraries also showed an increase in the number of reads that can be used for downstream analyses when using PacRAT.
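The idea of pairing barcodes with variants through consensus calling can be illustrated with a minimal sketch. This is not PacRAT's implementation: the function names, the `min_fraction` threshold, and the input layout are all hypothetical, and the multiple sequence alignment step is omitted, so the reads in each barcode group are assumed to be already aligned (equal length, padded with `-`).

```python
from collections import Counter, defaultdict


def group_by_barcode(reads):
    """Collect variant sequences under the barcode they were read with.

    `reads` is an iterable of (barcode, aligned_sequence) pairs
    (a hypothetical input layout chosen for this sketch).
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)
    return dict(groups)


def column_consensus(aligned, min_fraction=0.5, gap="-"):
    """Majority vote at each alignment column.

    Returns the consensus with gap characters removed, or None when any
    column has no symbol supported by more than `min_fraction` of reads.
    """
    if not aligned:
        return None
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    consensus = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        if count / len(aligned) <= min_fraction:
            return None  # ambiguous column: treat the barcode as unusable
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


def map_barcodes(reads, min_fraction=0.5):
    """Return {barcode: consensus} for barcodes with an unambiguous consensus."""
    mapping = {}
    for barcode, seqs in group_by_barcode(reads).items():
        consensus = column_consensus(seqs, min_fraction)
        if consensus is not None:
            mapping[barcode] = consensus
    return mapping
```

For example, three reads `ACGT`, `ACGT`, and `ACCT` sharing one barcode would yield the consensus `ACGT`, while a barcode whose two reads disagree at a column would be dropped rather than assigned a guessed variant.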

AVAILABILITY AND IMPLEMENTATION

PacRAT is written in Python and is freely available (https://github.com/dunhamlab/PacRAT).

SUPPLEMENTARY INFORMATION

Supplemental data are available at Bioinformatics online.



