Improving hazard characterization in microbial risk assessment using next generation sequencing data and machine learning: Predicting clinical outcomes in shigatoxigenic Escherichia coli.

Author affiliations

Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, Building 204, 2800 Kgs. Lyngby, Denmark.

Publication information

Int J Food Microbiol. 2019 Mar 2;292:72-82. doi: 10.1016/j.ijfoodmicro.2018.11.016. Epub 2018 Dec 4.

DOI: 10.1016/j.ijfoodmicro.2018.11.016
PMID: 30579059
Abstract

The ever-decreasing cost and increasing throughput of next generation sequencing (NGS) techniques have resulted in a rapid increase in the availability of NGS data. Such data have the potential for rapid, reproducible and highly discriminative characterization of pathogens. This provides an opportunity in microbial risk assessment to account for variations in survivability and virulence among strains. A major challenge for such attempts remains the high dimensionality of genomic data relative to the number of isolates. Machine learning (ML)-based predictive risk modelling provides a solution to this "curse of dimensionality" while accounting for individual effects that depend on interactions with other genetic and environmental factors. This pilot study explores the potential of ML in predicting health endpoints resulting from shigatoxigenic E. coli (STEC) infection. Accessory genes in amino acid sequences were used as model input to predict and differentiate health outcomes in STEC infections, including diarrhea, bloody diarrhea, hemolytic uremic syndrome (HUS) and their combinations. Outcome severity was also distinguished by hospitalization. A matrix of percent similarity between accessory genes and the E. coli genomes was generated and subsequently used as input for ML. The performance of the ML algorithms random forest, support vector machine (radial and linear kernel), gradient boosting and logit boost was compared. Logit boost was the best model, with an outcome prediction accuracy of 0.75 (95% CI: 0.60, 0.86) and an excellent or substantial performance (Kappa = 0.72). Important genetic predictors of riskier STEC clinical outcomes included proteins involved in initial attachment to the host cell, persistence of plasmids or genomic islands, conjugative plasmid transfer and formation of sex pili, regulation of locus of enterocyte effacement expression, post-translational acetylation of proteins, facilitation of the rearrangement or deletion of sections within the pathogenic islands, and transport of macromolecules across the cell envelope. Further studies are proposed on the proteins with undefined or unclear functionality. One protein family in particular predicted HUS outcome. Toxin-antitoxin systems are potential stress adaptation markers which may mediate environmental persistence of strains in diverse sources. We foresee the application of the ML approach to the set-up of real-time online analysis of whole genome sequence data to estimate the human health risk at the population or strain level. The ML approach is envisaged to support the prediction of more specific STEC clinical endpoint types from input isolate sequence data.
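The abstract summarizes multi-class outcome prediction with an accuracy, a 95% confidence interval and a Kappa statistic. The following is a minimal sketch of how such metrics can be computed, not the authors' code. It assumes Cohen's kappa and an exact (Clopper-Pearson) binomial interval, which is one common choice; the abstract does not state which interval method was used. The function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter
from math import comb


def accuracy(y_true, y_pred):
    """Fraction of isolates whose predicted outcome matches the observed one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    if p_e == 1.0:  # only one class present; kappa is undefined, return 1.0 by convention
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def _binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing, tol=1e-10):
    """Solve f(p) = target on [0, 1] for a monotone function f."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k correct predictions out of n."""
    if k == 0:
        lower = 0.0
    else:
        # P(X >= k | p) rises with p; the lower bound is where it equals alpha/2.
        lower = _bisect(lambda p: 1 - _binom_cdf(k - 1, n, p), alpha / 2, True)
    if k == n:
        upper = 1.0
    else:
        # P(X <= k | p) falls with p; the upper bound is where it equals alpha/2.
        upper = _bisect(lambda p: _binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


if __name__ == "__main__":
    # Toy labels (hypothetical): D = diarrhea, BD = bloody diarrhea, HUS.
    y_true = ["D", "D", "BD", "BD", "HUS", "HUS", "D", "BD"]
    y_pred = ["D", "BD", "BD", "BD", "HUS", "D", "D", "BD"]
    k = sum(t == p for t, p in zip(y_true, y_pred))
    print("accuracy:", accuracy(y_true, y_pred))
    print("kappa:", cohen_kappa(y_true, y_pred))
    print("95% CI:", exact_ci(k, len(y_true)))
```

Kappa is reported alongside accuracy because, with several outcome classes of unequal size, raw accuracy can look high by chance alone, while kappa discounts the agreement expected from the class frequencies.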
