Feng Shuyi, Ramachandran Padmini, Blaustein Ryan A, Pradhan Abani K
Department of Nutrition and Food Science, University of Maryland, College Park, MD, United States.
Human Foods Program, U.S. Food and Drug Administration, College Park, MD, United States.
Front Microbiol. 2025 Mar 19;16:1549260. doi: 10.3389/fmicb.2025.1549260. eCollection 2025.
[Pathogen name omitted in source] is the leading cause of illnesses and outbreaks linked to seafood consumption across the globe. Understanding how this pathogen may be adapted to persist along the farm-to-table supply chain has applications for addressing food safety. This study used machine learning to develop robust models classifying the genomic diversity of isolates of this pathogen from environmental (n = 176), seafood (n = 975), and clinical (n = 865) sample origins. We constructed a pangenome of the respective genome assemblies and employed the random forest algorithm to develop predictive models that identify gene clusters encoding metabolism, virulence, and antibiotic resistance associated with isolate source type. Models comparing the genomes of seafood and clinical isolates achieved high balanced accuracy (≥0.80) and area under the receiver operating characteristic curve (≥0.87) for all three of these functional categories. Major virulence factors, including [two gene names omitted in source], type III secretion system-related genes, and four alpha-hemolysin genes [names omitted in source], were identified as important differentiating factors in our seafood-clinical virulence model, underscoring the need for further investigation. Our model also revealed significant differences in antimicrobial resistance (AMR) gene patterns between seafood and clinical samples; genes conferring resistance to tetracycline, elfamycin, and multiple drug classes (phenicol, diaminopyrimidine, and fluoroquinolone antibiotics) were identified as the top three key variables. These findings provide crucial insights for developing effective surveillance and management strategies to address the public health threats associated with this pathogen.
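The two performance metrics reported above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label encoding (1 = clinical, 0 = seafood), and the toy labels and scores are all assumptions made for illustration. Balanced accuracy averages sensitivity and specificity, so it is not inflated when one source class outnumbers the other; the area under the ROC curve is computed here as the probability that a randomly chosen positive isolate receives a higher score than a randomly chosen negative one.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def auroc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 "clinical" (1) and 5 "seafood" (0) isolates,
# with made-up classifier scores standing in for random forest vote fractions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(balanced_accuracy(y_true, y_pred))
print(auroc(y_true, scores))
```

In practice, a library implementation such as scikit-learn's `balanced_accuracy_score` and `roc_auc_score` would normally be used; the hand-written versions above only make the definitions explicit.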