European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.
DeepSeq, School of Life Sciences, Queen's Medical Centre, University of Nottingham, Nottingham, UK.
Nat Biotechnol. 2023 Jul;41(7):1018-1025. doi: 10.1038/s41587-022-01580-z. Epub 2023 Jan 2.
Nanopore sequencers can select which DNA molecules to sequence, rejecting a molecule after analysis of a small initial part. Currently, selection is based on predetermined regions of interest that remain constant throughout an experiment. Sequencing efforts, thus, cannot be re-focused on molecules likely contributing most to experimental success. Here we present BOSS-RUNS, an algorithmic framework and software to generate dynamically updated decision strategies. We quantify uncertainty at each genome position with real-time updates from data already observed. For each DNA fragment, we decide whether the expected decrease in uncertainty that it would provide warrants fully sequencing it, thus optimizing information gain. BOSS-RUNS mitigates coverage bias between and within members of a microbial community, leading to improved variant calling; for example, low-coverage sites of a species at 1% abundance were reduced by 87.5%, with 12.5% more single-nucleotide polymorphisms detected. Such data-driven updates to molecule selection are applicable to many sequencing scenarios, such as enriching for regions with increased divergence or low coverage, reducing time-to-answer.
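The decision rule described above can be illustrated with a minimal sketch. It is not the BOSS-RUNS implementation: the uniform prior over the four bases, the fixed per-base error rate, the use of Shannon entropy as the uncertainty measure, the mean-gain threshold, and every name in the code (`posterior`, `expected_gain`, `should_sequence`) are assumptions made for illustration only. The sketch also ignores read-length distributions and the time cost of rejecting a molecule.

```python
import math

BASES = "ACGT"

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def posterior(counts, error=0.1):
    """Posterior over the true base at one site, given observed base counts.

    Assumes a uniform prior and a symmetric error model (both hypothetical).
    """
    log_like = []
    for true_base in BASES:
        ll = 0.0
        for obs, n in counts.items():
            ll += n * math.log((1 - error) if obs == true_base else error / 3)
        log_like.append(ll)
    top = max(log_like)
    weights = [math.exp(ll - top) for ll in log_like]  # stabilised exponentials
    total = sum(weights)
    return [w / total for w in weights]

def expected_gain(counts, error=0.1):
    """Expected entropy reduction (bits) from one more observation at a site."""
    post = posterior(counts, error)
    h_next = 0.0
    for obs in BASES:
        # Predictive probability of observing `obs` under the current posterior.
        p_obs = sum(
            post[i] * ((1 - error) if b == obs else error / 3)
            for i, b in enumerate(BASES)
        )
        updated = dict(counts)
        updated[obs] = updated.get(obs, 0) + 1
        h_next += p_obs * entropy(posterior(updated, error))
    return entropy(post) - h_next

def should_sequence(site_counts, start, length, threshold=0.5):
    """Accept a fragment if its mean expected gain per site meets the threshold."""
    sites = site_counts[start:start + length]
    if not sites:
        return False
    mean_gain = sum(expected_gain(c) for c in sites) / len(sites)
    return mean_gain >= threshold

# Toy genome: three well-covered sites followed by three uncovered sites.
site_counts = [{"A": 30}] * 3 + [{}] * 3
print(should_sequence(site_counts, 0, 3))  # fragment over well-covered sites
print(should_sequence(site_counts, 3, 3))  # fragment over uncovered sites
```

In this toy setting, a fragment mapping to the well-covered sites contributes almost no expected information and would be rejected, whereas a fragment mapping to the uncovered sites would be sequenced in full. Updating `site_counts` as reads arrive is what makes the strategy dynamic.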