Uematsu Masaaki, Baskin Jeremy M
Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA.
Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853, USA.
bioRxiv. 2025 Feb 13:2023.04.12.536413. doi: 10.1101/2023.04.12.536413.
Plasmid construction is central to life science research, and sequence verification is arguably its costliest step. Long-read sequencing has emerged as a competitor to Sanger sequencing, with the principal benefit that whole plasmids can be sequenced in a single run. Nevertheless, the current cost of nanopore sequencing is still prohibitive for routine sequencing during plasmid construction. We develop a computational approach, termed Simple Algorithm for Very Efficient Multiplexing of Oxford Nanopore Experiments for You (SAVEMONEY), that guides researchers in mixing multiple plasmids and then computationally de-mixes the resulting sequences. SAVEMONEY defines optimal mixtures in a pre-survey step and, following sequencing, executes a post-analysis workflow involving sequence classification, alignment, and consensus determination. By using Bayesian analysis with a prior probability based on the expected plasmid construction error rate, high-confidence sequences can be obtained for each plasmid in the mixture. Plasmids differing by as few as two bases can be mixed and submitted as a single sample for nanopore sequencing, and routine multiplexing of even six plasmids per 180 reads still maintains high consensus accuracy. SAVEMONEY should further democratize whole-plasmid sequencing by nanopore and related technologies, driving the effective cost of whole-plasmid sequencing below that of a single Sanger sequencing run.
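To illustrate how a prior based on the expected construction error rate can enter a consensus call, the following is a minimal sketch for a single alignment column. It is not the SAVEMONEY implementation: the function names (`posterior_base`, `call_base`), the uniform per-read error model, and the default values for `construction_error` and `read_error` are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def posterior_base(observed, expected, construction_error=1e-4, read_error=0.05):
    """Posterior probability of each true base at one alignment column.

    observed: string of bases (A/C/G/T) read at this position.
    expected: base in the designed (reference) plasmid sequence.
    construction_error: assumed prior probability that the plasmid
        differs from the design at this position (hypothetical value).
    read_error: assumed per-read substitution error rate (hypothetical value).
    """
    log_post = {}
    for base in BASES:
        # Prior: the designed base is very likely; errors split evenly.
        if base == expected:
            prior = 1.0 - construction_error
        else:
            prior = construction_error / 3.0
        log_p = math.log(prior)
        # Likelihood: each read is correct with probability 1 - read_error.
        for obs in observed:
            if obs == base:
                log_p += math.log(1.0 - read_error)
            else:
                log_p += math.log(read_error / 3.0)
        log_post[base] = log_p
    # Normalize in log space to avoid underflow with many reads.
    peak = max(log_post.values())
    weights = {b: math.exp(lp - peak) for b, lp in log_post.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


def call_base(observed, expected, **kwargs):
    """Return the most probable base and its posterior probability."""
    post = posterior_base(observed, expected, **kwargs)
    best = max(post, key=post.get)
    return best, post[best]
```

Under these assumptions, a single discordant read does not override the prior (`call_base("G", "A")` still returns `"A"`), whereas 28 of 30 reads supporting `G` yields a confident `G` call. This shows why a modest number of reads per plasmid can be enough for a high-confidence consensus.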