Bronner Iraad F, Dawson Emma, Park Naomi, Piepenburg Olaf, Quail Michael A
Wellcome Sanger Institute (WT), Hinxton, United Kingdom.
Front Genet. 2025 Jan 6;15:1505839. doi: 10.3389/fgene.2024.1505839. eCollection 2024.
The Darwin Tree of Life (DToL) project aims to generate high-quality reference genomes for all eukaryotic organisms in Britain and Ireland. At the time of writing, PacBio HiFi reads are generated for all samples using the Sequel IIe systems by the Wellcome Sanger Institute's Scientific Operations teams; however, we expect lessons from this work to apply directly to the Revio system too, as the core principles of SMRT sequencing remain the same. We observed that HiFi yield is highly variable for DToL samples, and we investigated what drives this variation and how it might be mitigated. To support these investigations, a number of controls were evaluated to ensure that the library and sequencing preparation procedures, reagents, consumables, and Sequel IIe instruments were performing as expected. Our findings indicate that a primary factor driving variability in HiFi yield is the quality of the DNA prior to library construction, e.g., purity, size, and damage. We investigated whether quality assessment assays could link measurable DNA damage or purity to sequencing yield. Some correlation could be established; however, no assay was predictive of sequencing yield for all samples, indicating that the variability is driven by multiple factors that may interact. We demonstrate that contaminants present in some samples are the cause of very low HiFi yield, and show that these contaminants can negatively affect the PacBio internal sequencing control and samples multiplexed on the same SMRT Cell. We found that consistently high yields could be obtained if an amplification workflow was utilised, namely PacBio's ultra-low input library preparation protocol.