Fu Chia-Ming, Ngo Ike, Lau Pak Sheung, Ivanchuk Yaroslav, Chou Fan-Ya, Wang Chih-Hung, Lin Chien-Yu, Tsai Chu-Lin, Chen Shey-Ying, Lu Tsung-Chien, Wei Hung-Yu
National Taiwan University Hospital, Department of Emergency Medicine, Taipei City, Taiwan.
Min-Sheng General Hospital, Department of Emergency Medicine, Taoyuan City, Taiwan.
West J Emerg Med. 2025 May 30;26(3):617-626. doi: 10.5811/westjem.35866.
Bacteremia is common but difficult to diagnose early, and without prompt treatment it can cause significant morbidity and mortality. We aimed to develop machine-learning (ML) algorithms that identify bacteremia among febrile patients presenting to the emergency department (ED), using only data readily available at triage.
We included all adult patients (≥18 years of age) who presented to the ED of National Taiwan University Hospital (NTUH), a tertiary teaching hospital in Taiwan, with a chief complaint of fever or a measured body temperature above 38°C, and who had at least one blood culture drawn during the ED encounter. We extracted data from the Integrated Medical Database of NTUH for 2009 to 2018. The dataset included patient demographics, triage details, symptoms, and medical history. Bacteremia was defined as a blood culture positive for at least one potential pathogen and served as the binary classification label. We split the dataset into a training/validation set and a testing set (60-to-40 ratio) and trained five supervised ML models using k-fold cross-validation. Model performance was evaluated in the testing set using the area under the receiver operating characteristic curve (AUC).
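The evaluation design described above (a 60-to-40 split, k-fold cross-validation on the training portion, and AUC on the held-out portion) can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, stratifying the split by label is an assumption (the abstract reports only the ratio), and `k=5` is a placeholder because the abstract does not state the number of folds.

```python
import random

def stratified_split(labels, train_fraction=0.6, seed=0):
    """Return (train_idx, test_idx) with a similar positive rate in both parts.
    Stratification is an assumption; the abstract states only the 60/40 ratio."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)

def k_folds(indices, k=5, seed=0):
    """Yield (fit_idx, val_idx) pairs; each index is used for validation once.
    k=5 is a placeholder value, not taken from the study."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]

def roc_auc(labels, scores):
    """AUC from the rank-sum (Mann-Whitney) statistic; tied scores share a rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

In use, a model would be fitted on each `fit_idx` and tuned on the matching `val_idx`, and `roc_auc` would be called once on the testing indices with the final model's predicted probabilities.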
We included 80,201 cases in this study, of which 48,120 were assigned to the training/validation set and 32,081 to the testing set. Bacteremia was identified in 5,831 (12.1%) cases in the training/validation set and 3,824 (11.9%) cases in the testing set. All ML models performed well, with CatBoost achieving the highest AUC (.844, 95% confidence interval [CI] .837-.850), followed by extreme gradient boosting (.843, 95% CI .836-.849), gradient boosting (.842, 95% CI .836-.849), light gradient boosting machine (.841, 95% CI .834-.847), and random forest (.828, 95% CI .821-.834).
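The abstract does not say how the 95% CIs around each AUC were obtained. One common approach is a percentile bootstrap over test-set cases, sketched below under that assumption; the function names, the 1,000 resamples, and the seed are illustrative choices rather than details from the study.

```python
import random

def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 1/2).
    Pairwise form: simple, but slow for very large test sets."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC (an assumed method, not the authors')."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples missing a class; AUC is undefined there
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With a testing set of 32,081 cases, the rank-based AUC formula would be a faster substitute for the pairwise loop; the resampling logic stays the same.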
Our ML model showed excellent discriminatory performance in predicting bacteremia using only clinical features available at ED triage. If successfully implemented in the ED, it has the potential to improve care quality and save lives.