Bradford Lauren M, Carrillo Catherine, Wong Alex
Department of Biology, Carleton University, Ottawa, Canada.
Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Canada.
BMC Bioinformatics. 2024 Dec 3;25(1):372. doi: 10.1186/s12859-024-05952-x.
Culture-independent diagnostic tests are gaining popularity as tools for detecting pathogens in food. Shotgun sequencing holds substantial promise for food testing as it provides abundant information on microbial communities, but the challenge is in analyzing large and complex sequencing datasets with a high degree of both sensitivity and specificity. Falsely classifying sequencing reads as originating from pathogens can lead to unnecessary food recalls or production shutdowns, while low sensitivity resulting in false negatives could lead to preventable illness.
We used simulated and published shotgun sequencing datasets containing Salmonella-derived reads to explore the appearance and mitigation of false positive results using the popular taxonomic annotation tools Kraken2 and Metaphlan4. Using default parameters, Kraken2 is sensitive but prone to false positives, while Metaphlan4 is more specific but unable to detect Salmonella at low abundance. We then developed a bioinformatic pipeline for identifying and removing reads falsely identified as Salmonella by Kraken2 while retaining high sensitivity. Carefully considering software parameters and database choices is essential to avoiding false positive sample calls. With well-chosen parameters plus additional steps to confirm the taxonomic origin of reads, it is possible to detect pathogens with very high specificity and sensitivity.
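As a rough illustration of what a read-level confirmation step could look like, the sketch below re-scores reads that Kraken2 assigned to Salmonella, using the per-read k-mer mappings in Kraken2's standard output. It is not the authors' pipeline, which the abstract does not detail. The taxid set, the 0.5 threshold, and the function names are assumptions made for the example. It also assumes default numeric taxids in the output rather than names.

```python
from typing import Iterable, Iterator, Set, Tuple

# Assumption: illustrative clade only (genus Salmonella and S. enterica).
# A real run would list every taxid under the Salmonella node of the taxonomy.
SALMONELLA_TAXIDS: Set[str] = {"590", "28901"}


def clade_kmer_fraction(lca_field: str, clade: Set[str]) -> float:
    """Fraction of unambiguous k-mers whose LCA taxid lies in `clade`."""
    in_clade = 0
    total = 0
    for token in lca_field.split():
        if token == "|:|":  # separator between mates in paired-end output
            continue
        taxid, _, count = token.rpartition(":")
        if taxid == "A":  # k-mers containing ambiguous nucleotides
            continue
        n = int(count)
        total += n
        if taxid in clade:
            in_clade += n
    return in_clade / total if total else 0.0


def confirmed_reads(
    lines: Iterable[str],
    clade: Set[str],
    min_fraction: float = 0.5,  # hypothetical threshold, not from the paper
) -> Iterator[Tuple[str, float]]:
    """Yield (read_id, fraction) for classified reads that pass the filter."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Columns: C/U flag, read ID, taxid, length, LCA k-mer mappings
        if len(fields) < 5 or fields[0] != "C":
            continue
        read_id, taxid, lca_field = fields[1], fields[2], fields[4]
        if taxid not in clade:
            continue
        fraction = clade_kmer_fraction(lca_field, clade)
        if fraction >= min_fraction:
            yield read_id, fraction


if __name__ == "__main__":
    example = [
        "C\tread1\t590\t100\t590:60 0:6",
        "C\tread2\t590\t100\t590:3 562:10 0:50 A:3",
    ]
    for read_id, fraction in confirmed_reads(example, SALMONELLA_TAXIDS):
        print(read_id, round(fraction, 3))
```

In this toy input, read2 is labelled Salmonella by only 3 of its 63 unambiguous k-mers, so the filter drops it. The check resembles raising Kraken2's confidence threshold after the fact. A stricter confirmation step, such as re-aligning the retained reads against reference genomes, would be a separate stage.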