Department of Bioengineering, Rice University, Houston, TX, 77030, USA.
Department of Civil and Environmental Engineering, Rice University, Houston, TX, 77005, USA.
Nat Commun. 2024 Jul 26;15(1):6306. doi: 10.1038/s41467-024-49957-9.
Tiled amplicon sequencing has served as an essential tool for tracking the spread and evolution of pathogens. Over 15 million complete SARS-CoV-2 genomes are now publicly available, most sequenced and assembled via tiled amplicon sequencing. While computational tools for tiled amplicon design exist, they require downstream manual optimization both computationally and experimentally, which is slow and costly. Here we present Olivar, a first step towards a fully automated, variant-aware design of tiled amplicons for pathogen genomes. Olivar converts each nucleotide of the target genome into a numeric risk score, capturing undesired sequence features that should be avoided. In a direct comparison with PrimalScheme, we show that Olivar has fewer mismatches overlapping with primers and fewer predicted PCR byproducts. We also compare Olivar head-to-head with ARTIC v4.1, the most widely used primer set for SARS-CoV-2 sequencing, and show that Olivar yields similar read mapping rates (~90%) and better coverage than the manually designed ARTIC v4.1 amplicons. We also evaluate Olivar on real wastewater samples and find that Olivar has up to 3-fold higher mapping rates while retaining similar coverage. In summary, Olivar automates and accelerates the generation of tiled amplicons, even in situations of high mutation frequency and/or density. Olivar is available online as a web application at https://olivar.rice.edu and can be installed locally as a command line tool with Bioconda. Source code, installation guide, and usage are available at https://github.com/treangenlab/Olivar.