German Federal Institute for Risk Assessment (BfR), Berlin, Germany.
Appl Environ Microbiol. 2020 Feb 18;86(5). doi: 10.1128/AEM.02265-19.
We compared the performance of four open-source Salmonella typing tools (SeqSero, SeqSero2, Salmonella In Silico Typing Resource [SISTR], and Metric Oriented Sequence Typer [MOST]) to assess their potential for replacing laboratory serological testing with serovar predictions from whole-genome sequencing data. We conducted a retrospective analysis of 1,624 Salmonella isolates of 72 serovars submitted to the German National Salmonella Reference Laboratory between 1999 and 2019. All isolates were derived from animal and foodstuff origins. We conducted Illumina short-read sequencing and compared the serovar prediction results with the results of routine laboratory serotyping. The best-performing serovar prediction tool was SISTR, with 94% correctly typed isolates, followed by SeqSero2 (87%), SeqSero (81%), and MOST (79%). Furthermore, we found that mapping-based tools like SeqSero and SeqSero2 (allele mode) were more reliable for the prediction of monophasic variants, while sequence type and cluster-based methods like MOST and SISTR (core-genome multilocus sequence type [cgMLST]) showed greater resilience when confronted with GC-biased sequencing data. We showed that the choice of library preparation kit could substantially affect O antigen detection, due to the low GC content of the O antigen genes. Although the accuracy of computational serovar predictions is still not quite on par with traditional serotyping by reference laboratories, the command-line tools investigated in this study perform a rapid, efficient, inexpensive, and reproducible analysis, which can be integrated into in-house characterization pipelines. Based on our results, we find SISTR most suitable for automated, routine serotyping for public health surveillance of Salmonella.

Salmonella spp. are important foodborne pathogens. To reduce the number of infected patients, it is essential to understand which subtypes of the bacteria cause disease outbreaks. Traditionally, characterization of Salmonella requires serological testing, a laboratory method by which isolates can be classified into over 2,600 distinct subtypes, called serovars. Due to recent advances in whole-genome sequencing, many tools have been developed to replace traditional testing methods with computational analysis of genome sequences. It is crucial to validate that these tools, many already in use for routine surveillance, deliver accurate and reliable serovar information. In this study, we set out to determine which of the currently available open-source command-line tools is most suitable to replace serological testing. A thorough evaluation of the differing computational approaches is highly important to ensure the backward compatibility of serotyping data and to maintain comparability between laboratories.
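The percentages of correctly typed isolates reported above are concordance rates between each tool's prediction and the laboratory serotyping result. A minimal sketch of such a calculation is given below. It assumes a simple exact-match rule after case and whitespace normalization; the study's actual scoring criteria (for example, how monophasic variants or ambiguous calls are counted) are not described here and may differ. The function names, tool labels, and isolate data are hypothetical and serve only as illustration.

```python
def normalize(serovar: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as mismatches."""
    return " ".join(serovar.lower().split())


def concordance(reference: dict[str, str],
                predictions: dict[str, dict[str, str]]) -> dict[str, float]:
    """Return, per tool, the fraction of isolates whose prediction matches the reference serovar.

    Isolates without a prediction count as incorrect (an assumption of this sketch).
    """
    rates = {}
    for tool, calls in predictions.items():
        correct = sum(
            1
            for isolate, ref_serovar in reference.items()
            if normalize(calls.get(isolate, "")) == normalize(ref_serovar)
        )
        rates[tool] = correct / len(reference)
    return rates


# Made-up toy data, not taken from the study.
reference = {
    "iso1": "Typhimurium",
    "iso2": "Enteritidis",
    "iso3": "Infantis",
    "iso4": "Derby",
}
predictions = {
    "tool_a": {"iso1": "Typhimurium", "iso2": "Enteritidis",
               "iso3": "Infantis", "iso4": "derby"},
    "tool_b": {"iso1": "Typhimurium", "iso2": "Dublin", "iso3": "Infantis"},
}

rates = concordance(reference, predictions)
for tool, rate in rates.items():
    print(f"{tool}: {rate:.0%} correctly typed")
```

Counting missing predictions as incorrect is the conservative choice; a separate "not typed" category could be reported instead if failed runs should be distinguished from wrong calls.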