From Preliminary Urinalysis to Decision Support: Machine Learning for UTI Prediction in Real-World Laboratory Data.

Author information

Sergounioti Athanasia, Rigas Dimitrios, Zoitopoulos Vassilios, Kalles Dimitrios

Affiliations

Department of Laboratory Medicine, General Hospital of Amfissa, 33100 Amfissa, Greece.

Independent Researcher, 33100 Amfissa, Greece.

Publication information

J Pers Med. 2025 May 16;15(5):200. doi: 10.3390/jpm15050200.

DOI:10.3390/jpm15050200

PMID:40423071

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12113611/

Abstract

Background/Objectives: Urinary tract infections (UTIs) are frequently diagnosed empirically, often leading to overtreatment and rising antimicrobial resistance. This study aimed to develop and evaluate machine learning (ML) models that predict urine culture outcomes using routine urinalysis and demographic data, supporting more targeted empirical antibiotic use. Methods: A real-world dataset comprising 8065 urinalysis records from a hospital laboratory was used to train five ensemble ML models: random forest, XGBoost (eXtreme gradient boosting), extra trees, a voting classifier, and a stacking classifier. Models were developed using 10-fold stratified cross-validation and assessed via clinically relevant metrics including specificity, sensitivity, likelihood ratios, and diagnostic odds ratios (DORs). To enhance screening utility, threshold optimization was applied to the best-performing model (XGBoost) using the Youden index. Results: XGBoost and random forest demonstrated the most balanced diagnostic profiles (AUROC: 0.819 and 0.791, respectively), with DORs exceeding 21. The voting and stacking classifiers achieved the highest specificity (>95%) and positive likelihood ratios (>10) but exhibited lower sensitivity. Feature importance analysis identified positive nitrites, white blood cell count, and specific gravity as key predictors. Threshold tuning of XGBoost improved sensitivity from 70.2% to 87.9% and reduced false negatives by 82%, with an associated NPV of 96.4%. The adjusted model reduced overtreatment by 56% compared to empirical prescribing. Conclusions: ML models based on structured urinalysis and demographic data can support clinical decision-making for UTIs. While high-specificity models may reduce unnecessary antibiotic use, sensitivity trade-offs must be considered. Threshold-optimized XGBoost offers a clinically adaptable tool for empirical treatment decisions by improving sensitivity and reducing overtreatment, thus supporting more personalized and judicious use of antibiotics.
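The abstract reports sensitivity, specificity, likelihood ratios, and DORs, and describes choosing a decision threshold with the Youden index (J = sensitivity + specificity - 1). The following is a minimal sketch of how those quantities relate, not the authors' pipeline: the labels and scores are invented toy values, and the function names (`confusion_counts`, `diagnostic_metrics`, `youden_threshold`) are hypothetical helpers written for this illustration. It assumes binary labels (1 = positive culture) and that both classes are present.

```python
from typing import NamedTuple, Sequence, Tuple


class DiagnosticMetrics(NamedTuple):
    sensitivity: float
    specificity: float
    lr_positive: float   # sensitivity / (1 - specificity)
    lr_negative: float   # (1 - sensitivity) / specificity
    dor: float           # (TP * TN) / (FP * FN)


def confusion_counts(y_true: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> DiagnosticMetrics:
    """Compute the diagnostic metrics named in the abstract from raw counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    inf = float("inf")
    lr_pos = sens / (1 - spec) if fp > 0 else inf
    lr_neg = (1 - sens) / spec if tn > 0 else inf
    # DOR is undefined when FP or FN is zero; reported here as infinity.
    dor = (tp * tn) / (fp * fn) if fp * fn > 0 else inf
    return DiagnosticMetrics(sens, spec, lr_pos, lr_neg, dor)


def youden_threshold(y_true: Sequence[int],
                     scores: Sequence[float]) -> Tuple[float, float]:
    """Scan each observed score as a candidate cut-off and keep the one
    that maximizes J = sensitivity + specificity - 1."""
    best_t, best_j = 0.0, -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(y_true, scores, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy data (invented for illustration only; not from the study).
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

default = diagnostic_metrics(*confusion_counts(y_true, scores, 0.5))
best_t, best_j = youden_threshold(y_true, scores)
tuned = diagnostic_metrics(*confusion_counts(y_true, scores, best_t))

print("default 0.5:", default)
print("Youden threshold:", best_t, "J =", round(best_j, 3))
print("tuned:", tuned)
```

On this toy set the Youden cut-off falls below 0.5, so sensitivity rises while specificity drops, which is the same kind of trade-off the abstract describes for the tuned XGBoost model. In practice the scores would come from out-of-fold predictions of the cross-validated classifier rather than a hand-written list.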