Pacybara: accurate long-read sequencing for barcoded mutagenized allelic libraries.

Affiliations

Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada.

Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada.

Publication information

Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae182.

DOI:10.1093/bioinformatics/btae182
PMID:38569896
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11021806/
Abstract

MOTIVATION

Long-read sequencing technologies, an attractive solution for many applications, often suffer from higher error rates. Alignment of multiple reads can improve base-calling accuracy, but some applications, e.g. sequencing mutagenized libraries where multiple distinct clones differ by one or a few variants, require the use of barcodes or unique molecular identifiers. Unfortunately, sequencing errors can interfere with correct barcode identification, and a given barcode sequence may be linked to multiple independent clones within a given library.

RESULTS

Here we focus on the target application of sequencing mutagenized libraries in the context of multiplexed assays of variant effects (MAVEs). MAVEs are increasingly used to create comprehensive genotype-phenotype maps that can aid clinical variant interpretation. Many MAVE methods use long-read sequencing of barcoded mutant libraries for accurate association of barcode with genotype. Existing long-read sequencing pipelines do not account for inaccurate sequencing or nonunique barcodes. Here, we describe Pacybara, which handles these issues by clustering long reads based on the similarities of (error-prone) barcodes while also detecting barcodes that have been associated with multiple genotypes. Pacybara also detects recombinant (chimeric) clones and reduces false positive indel calls. In three example applications, we show that Pacybara identifies and correctly resolves these issues.
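
The two ideas named above, clustering reads by the similarity of error-prone barcodes and detecting barcodes associated with more than one genotype, can be illustrated with a toy sketch. This is not Pacybara's implementation: the function names, the greedy seeding strategy, the use of Hamming distance (a real pipeline would likely need an edit distance that tolerates indels), and the `max_dist` and `min_fraction` thresholds are all assumptions chosen for illustration. Each read is reduced to a hypothetical `(barcode, genotype)` pair.

```python
from collections import Counter, defaultdict


def hamming(a: str, b: str) -> int:
    """Count mismatching positions; unequal lengths are treated as far apart."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def cluster_by_barcode(reads, max_dist=1):
    """Greedy clustering (illustrative assumption, not Pacybara's algorithm).

    Barcodes are visited from most to least abundant; each one joins the
    first existing seed within max_dist mismatches, or becomes a new seed.
    Returns a dict mapping seed barcode -> list of genotypes from its reads.
    """
    counts = Counter(barcode for barcode, _ in reads)
    seeds = []
    assignment = {}
    for barcode, _ in counts.most_common():
        seed = next((s for s in seeds if hamming(s, barcode) <= max_dist), None)
        if seed is None:
            seeds.append(barcode)
            seed = barcode
        assignment[barcode] = seed
    clusters = defaultdict(list)
    for barcode, genotype in reads:
        clusters[assignment[barcode]].append(genotype)
    return dict(clusters)


def call_clusters(clusters, min_fraction=0.8):
    """Return seed -> (majority genotype, ambiguous flag).

    A cluster is flagged as ambiguous when its most common genotype makes up
    less than min_fraction of the reads, which hints that the barcode may be
    shared by several independent clones.
    """
    calls = {}
    for seed, genotypes in clusters.items():
        top, n = Counter(genotypes).most_common(1)[0]
        calls[seed] = (top, n / len(genotypes) < min_fraction)
    return calls


# Hypothetical reads: (barcode, genotype summarized as a variant string).
reads = [
    ("ACGTACGT", "A10G"),
    ("ACGTACGT", "A10G"),
    ("ACGTACGA", "A10G"),  # barcode with a single sequencing error
    ("TTTTCCCC", "C5T"),
    ("TTTTCCCC", "C5T"),
    ("TTTTCCCC", "G7A"),   # same barcode, different genotype
    ("TTTTCCCC", "G7A"),
]

clusters = cluster_by_barcode(reads)
calls = call_clusters(clusters)
for seed, (genotype, ambiguous) in calls.items():
    print(seed, genotype, "ambiguous" if ambiguous else "ok")
```

In this toy input the erroneous barcode `ACGTACGA` is merged into the `ACGTACGT` cluster, while `TTTTCCCC` is flagged because its reads split evenly between two genotypes.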

AVAILABILITY AND IMPLEMENTATION

Pacybara, freely available at https://github.com/rothlab/pacybara, is implemented using R, Python, and bash for Linux. It runs on GNU/Linux HPC clusters via Slurm, PBS, or GridEngine schedulers. A single-machine simplex version is also available.
