Improving the Accuracy of Bulk Fitness Assays by Correcting Barcode Processing Biases.

Affiliations

Department of Physics, Washington University, St. Louis, MO, USA.

Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

Mol Biol Evol. 2024 Aug 2;41(8). doi: 10.1093/molbev/msae152.

DOI: 10.1093/molbev/msae152

PMID: 39041198

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11316221/

Abstract

Measuring the fitnesses of genetic variants is a fundamental objective in evolutionary biology. A standard approach for measuring microbial fitnesses in bulk involves labeling a library of genetic variants with unique sequence barcodes, competing the labeled strains in batch culture, and using deep sequencing to track changes in the barcode abundances over time. However, idiosyncratic properties of barcodes can induce nonuniform amplification or uneven sequencing coverage that causes some barcodes to be over- or under-represented in samples. This systematic bias can result in erroneous read count trajectories and misestimates of fitness. Here, we develop a computational method, named REBAR (Removing the Effects of Bias through Analysis of Residuals), for inferring the effects of barcode processing bias by leveraging the structure of systematic deviations in the data. We illustrate this approach by applying it to two independent data sets, and demonstrate that this method estimates and corrects for bias more accurately than standard proxies, such as GC-based corrections. REBAR mitigates bias and improves fitness estimates in high-throughput assays without introducing additional complexity to the experimental protocols, with potential applications in a range of experimental evolution and mutation screening contexts.
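The following is a minimal Python sketch of the general setting the abstract describes, not the REBAR algorithm itself, whose details are not given here. It assumes a common simplification: a barcode's fitness relative to a reference barcode is the slope of log(count / reference count) over time, and a processing bias acts as a multiplicative distortion of one barcode's read count at one time point. All names (`fit_line`, `fitness_and_residuals`, the barcode labels, and the toy counts) are hypothetical and chosen for illustration only.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fitness_and_residuals(times, counts_by_time, reference):
    """For each non-reference barcode, fit log(count / reference count)
    against time. The slope is a simple relative-fitness estimate; the
    residuals are the deviations of each time point from the fitted line."""
    results = {}
    for barcode in counts_by_time[0]:
        if barcode == reference:
            continue
        ys = [math.log(c[barcode] / c[reference]) for c in counts_by_time]
        slope, intercept = fit_line(times, ys)
        residuals = [y - (intercept + slope * t) for t, y in zip(times, ys)]
        results[barcode] = (slope, residuals)
    return results

# Toy data (assumed, not from the paper): barcode "A" grows with relative
# fitness 0.1 per generation against a neutral reference "ref".
times = [0, 8, 16, 24]
clean = [{"ref": 1000.0, "A": 1000.0 * math.exp(0.1 * t)} for t in times]

# Same data, but the count of "A" at the second time point is inflated 1.5x,
# standing in for a barcode-specific processing bias in that sample.
biased = [dict(c) for c in clean]
biased[1]["A"] *= 1.5

clean_slope, clean_res = fitness_and_residuals(times, clean, "ref")["A"]
biased_slope, biased_res = fitness_and_residuals(times, biased, "ref")["A"]
```

In this toy case the clean trajectory is fit exactly, while the inflated sample shifts the slope estimate and leaves a clearly nonzero residual at the affected time point. Systematic deviations of this kind are what the abstract says REBAR analyzes, although how it does so is not specified in this record.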

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b846/11316221/fdbb28623702/msae152f1.jpg
