Institute of Biology, Federal University of Bahia (UFBA), Salvador, Bahia, 41745-715, Brazil.
National Institute for Interdisciplinary and Transdisciplinary Studies in Ecology and Evolution (IN-TREE), Salvador, Brazil.
BMC Genomics. 2024 Sep 12;25(1):856. doi: 10.1186/s12864-024-10778-1.
The expansion of sequencing technologies in response to the COVID-19 pandemic has enabled pathogen (meta)genomics to be deployed as a routine component of surveillance in many countries. Scaling genomic surveillance, however, incurs costs in both equipment and sequencing reagents, and these costs should be optimized. Here, we evaluate the cost efficiency and performance of different read lengths for identifying pathogens in metagenomic samples. We compared performance metrics, costs, and time requirements for read lengths of 75, 150, and 300 base pairs (bp).
Our findings show that moving from 75 bp to 150 bp reads approximately doubles both cost and sequencing time. Opting for 300 bp reads leads to approximately two-fold and three-fold increases in cost and sequencing time, respectively, compared to 75 bp reads. For viral pathogen detection, median sensitivity ranged from 99% with 75 bp reads to 100% with 150-300 bp reads. Bacterial pathogen detection, however, was less effective with shorter reads: median sensitivity was 87% with 75 bp, 95% with 150 bp, and 97% with 300 bp reads. These findings were consistent across different levels of taxa abundance. The precision of pathogen detection with shorter reads was comparable to that with longer reads across most viral and bacterial taxa.
During disease outbreaks, when pathogens must be identified swiftly, we suggest prioritizing 75 bp read lengths, especially when the goal is to detect viral pathogens. This practical approach makes better use of resources, enabling more samples to be sequenced with streamlined workflows while maintaining a reliable response capability.
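The trade-off described above can be illustrated with a minimal Python sketch. It uses only the approximate relative figures reported in this abstract (75 bp as the 1.0 baseline) and is not the authors' analysis. The function names, the notion of abstract "budget units", and the simplifications that cost scales per sample and that each sample contains exactly one detectable pathogen are all assumptions made for illustration.

```python
# Approximate relative figures from the abstract; 75 bp is the baseline (1.0).
# "cost" and "time" are fold-changes, sensitivities are the reported medians.
READ_LENGTHS = {
    75: {"cost": 1.0, "time": 1.0, "viral_sens": 0.99, "bact_sens": 0.87},
    150: {"cost": 2.0, "time": 2.0, "viral_sens": 1.00, "bact_sens": 0.95},
    300: {"cost": 2.0, "time": 3.0, "viral_sens": 1.00, "bact_sens": 0.97},
}


def samples_within_budget(budget_units: float, read_length: int) -> int:
    """Number of samples affordable, assuming cost scales per sample
    (hypothetical simplification; one 75 bp sample costs 1 budget unit)."""
    return int(budget_units // READ_LENGTHS[read_length]["cost"])


def expected_detections(budget_units: float, read_length: int,
                        kind: str = "bact_sens") -> float:
    """Expected number of correctly detected pathogens, assuming each
    sample contains exactly one pathogen of the given kind (hypothetical)."""
    n_samples = samples_within_budget(budget_units, read_length)
    return n_samples * READ_LENGTHS[read_length][kind]


if __name__ == "__main__":
    budget = 100  # arbitrary budget units, chosen for illustration only
    for length in sorted(READ_LENGTHS):
        print(
            length,
            samples_within_budget(budget, length),
            expected_detections(budget, length, "viral_sens"),
            expected_detections(budget, length, "bact_sens"),
        )
```

Under these assumptions, a fixed budget covers roughly twice as many samples at 75 bp as at 150 bp or 300 bp, which is the reasoning behind prioritizing shorter reads when throughput matters more than the last few points of bacterial sensitivity.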