Managing false positives during detection of pathogen sequences in shotgun metagenomics datasets.

Author information

Bradford Lauren M, Carrillo Catherine, Wong Alex

Affiliations

Department of Biology, Carleton University, Ottawa, Canada.

Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Canada.

Publication information

BMC Bioinformatics. 2024 Dec 3;25(1):372. doi: 10.1186/s12859-024-05952-x.

Abstract

BACKGROUND

Culture-independent diagnostic tests are gaining popularity as tools for detecting pathogens in food. Shotgun sequencing holds substantial promise for food testing as it provides abundant information on microbial communities, but the challenge is in analyzing large and complex sequencing datasets with a high degree of both sensitivity and specificity. Falsely classifying sequencing reads as originating from pathogens can lead to unnecessary food recalls or production shutdowns, while low sensitivity resulting in false negatives could lead to preventable illness.

RESULTS

We used simulated and published shotgun sequencing datasets containing Salmonella-derived reads to explore how false positive results arise, and how they can be mitigated, when using the popular taxonomic annotation tools Kraken2 and Metaphlan4. With default parameters, Kraken2 is sensitive but prone to false positives, while Metaphlan4 is more specific but unable to detect Salmonella at low abundance. We then developed a bioinformatic pipeline that identifies and removes reads falsely identified as Salmonella by Kraken2 while retaining high sensitivity. Careful consideration of software parameters and database choices is essential to avoiding false positive sample calls. With well-chosen parameters plus additional steps to confirm the taxonomic origin of reads, it is possible to detect pathogens with very high specificity and sensitivity.
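The abstract does not describe the pipeline's individual steps, so the following is only a minimal sketch of one plausible first stage: pulling reads that Kraken2 assigned to Salmonella out of its per-read output and scoring how much k-mer evidence supports each call, so that weakly supported reads can be sent for further confirmation. It assumes Kraken2's standard tab-separated per-read output (without `--use-names`) and uses NCBI taxonomy IDs 590 (Salmonella) and 28901 (Salmonella enterica). The names `kmer_support`, `candidate_reads`, `min_support`, and the example lines are hypothetical, and the support score is a simplified stand-in that counts only the listed taxids, not a full clade as Kraken2's own confidence score does.

```python
from typing import Iterable, Iterator, NamedTuple, Set


class ReadCall(NamedTuple):
    read_id: str
    taxid: int
    support: float  # fraction of unambiguous k-mers hitting the target taxids


def kmer_support(kmer_field: str, target_taxids: Set[int]) -> float:
    """Fraction of a read's k-mers that map to any of the target taxids.

    The field is a space-separated list of "taxid:count" pairs. "A:count"
    marks k-mers with ambiguous bases and "|:|" separates mates of a pair;
    both are skipped here.
    """
    hits = 0
    total = 0
    for token in kmer_field.split():
        if token == "|:|":
            continue
        key, _, count = token.partition(":")
        if key == "A":
            continue
        n = int(count)
        total += n
        if int(key) in target_taxids:
            hits += n
    return hits / total if total else 0.0


def candidate_reads(
    lines: Iterable[str], target_taxids: Set[int], min_support: float = 0.0
) -> Iterator[ReadCall]:
    """Yield classified reads assigned to a target taxid with enough support."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] != "C":
            continue  # skip malformed lines and unclassified reads
        taxid = int(fields[2])
        if taxid not in target_taxids:
            continue
        support = kmer_support(fields[4], target_taxids)
        if support >= min_support:
            yield ReadCall(fields[1], taxid, support)


# Made-up example lines, for illustration only.
example = [
    "C\tread1\t590\t100\t590:40 0:20 28901:6",
    "C\tread2\t590\t100\t590:3 0:60 561:3",
    "U\tread3\t0\t100\t0:66",
]
salmonella_taxids = {590, 28901}  # assumption: a real run would list every descendant taxid
calls = list(candidate_reads(example, salmonella_taxids, min_support=0.5))
```

Reads that pass such a filter would still need the confirmation step the abstract mentions. The threshold of 0.5 is arbitrary and would have to be tuned against datasets with known composition.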

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/262a/11613480/3b2ea7f1ee37/12859_2024_5952_Fig1_HTML.jpg
