Konganti Kranti, Reed Elizabeth, Mammel Mark, Kayikcioglu Tunc, Binet Rachel, Jarvis Karen, Ferreira Christina M, Bell Rebecca L, Zheng Jie, Windsor Amanda M, Ottesen Andrea, Grim Christopher J, Ramachandran Padmini
Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, MD, United States.
Center for Veterinary Medicine, U.S. Food and Drug Administration, Laurel, MD, United States.
Front Microbiol. 2023 Aug 2;14:1200983. doi: 10.3389/fmicb.2023.1200983. eCollection 2023.
Most current Salmonella subtyping analyses rely on whole genome sequencing (WGS), which focuses on high-resolution analysis of single genomes, or multiple single genomes, from colonies isolated on microbiological agar plates. In this study, we introduce bioinformatics innovations for a metagenomic outbreak response workflow that accurately identifies multiple Salmonella serovars at the same time. bettercallsal is one of the first analysis tools to identify multiple Salmonella serotypes from metagenomic or quasi-metagenomic datasets with high accuracy, allowing these isolate-independent methods to be incorporated into surveillance and root cause investigations. It was tested on an in silico benchmark dataset comprising 29 unique Salmonella serovars, 46 non-Salmonella bacterial genomes, and 10 viral genomes at varying read depths. It was also tested on previously well-characterized and sequenced non-selective primary and selective enrichments of papaya and peach samples from separate outbreak investigations, in which multiple Salmonella serovars had been identified using traditional isolate culturing and WGS as well as nucleic acid assays. These datasets were also analyzed with a custom-built k-mer tool, SeqSero2, and Kallisto to compare their serotype calls with those of bettercallsal. The in silico dataset analyzed with bettercallsal achieved maximum precision, recall, and accuracy of 100%, 83%, and 94%, respectively. In the papaya outbreak samples, bettercallsal identified multiple serovars in agreement with the Luminex xMAP assay results and also identified more serovars per sample, as evidenced by NCBI SNP clustering. In the peach outbreak samples, bettercallsal identified two serovars, in concordance with the k-mer analysis and the Luminex xMAP assay. The Salmonella genome hit reported by bettercallsal clustered with the chicken isolate genome, as reported by the FDA peach outbreak investigation from sequenced isolates (WGS). Overall, bettercallsal outperformed the k-mer tool, SeqSero2, and Kallisto in identifying multiple serovars from enrichment cultures using shotgun metagenomic sequencing.
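For readers unfamiliar with the three reported metrics, the following minimal Python sketch shows how precision, recall, and accuracy are conventionally derived from true/false positive and negative counts. The counts used here are hypothetical: they were chosen only so that the rounded values match the figures quoted in the abstract and are not taken from the study. The names `ConfusionCounts`, `precision`, `recall`, and `accuracy` are likewise illustrative and are not part of bettercallsal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for presence/absence calls (hypothetical container)."""
    tp: int  # target present and reported
    fp: int  # target absent but reported
    tn: int  # target absent and not reported
    fn: int  # target present but not reported


def precision(c: ConfusionCounts) -> float:
    """Fraction of reported calls that are correct."""
    reported = c.tp + c.fp
    return c.tp / reported if reported else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of truly present targets that were reported."""
    present = c.tp + c.fn
    return c.tp / present if present else 0.0


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all decisions (present or absent) that are correct."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical counts, NOT from the paper: picked so the rounded
# percentages line up with the 100 / 83 / 94 quoted in the abstract.
example = ConfusionCounts(tp=24, fp=0, tn=56, fn=5)

print(f"precision = {precision(example):.0%}")
print(f"recall    = {recall(example):.0%}")
print(f"accuracy  = {accuracy(example):.0%}")
```

With zero false positives, precision is 100% regardless of how many targets are missed, which is why a tool can show perfect precision alongside lower recall.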