Ben-Haim Gal, Yosef Mika, Rowand Eyade, Ben-Yosef Jonathan, Berman Aya, Sina Sigal, Halabi Nitsan, Grossbard Eitan, Marziano Yehonatan, Segal Gad
Emergency Department, Chaim Sheba Medical Center, Ramat-Gan, Israel.
The Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel.
Digit Health. 2024 Sep 12;10:20552076241277673. doi: 10.1177/20552076241277673. eCollection 2024 Jan-Dec.
Prompt diagnosis of bacteremia in the emergency department (ED) is of utmost importance. Nevertheless, the average time to the first clinical laboratory finding ranges from 1 to 3 days. Although a myriad of scoring systems exist for predicting occult bacteremia, efforts to apply artificial intelligence (AI) in this realm are still preliminary. In the current study, we combined an AI algorithm with a Natural Language Processing (NLP) algorithm, which could potentially increase the yield extracted from clinical ED data.
This study involved adult patients who visited our emergency department and had at least one blood culture taken to rule out bacteremia. Using both tabular and free-text data, we built an ensemble model that applies XGBoost to the structured data and logistic regression (LR) to the textual data, with the text represented as a bag-of-words (BOW) weighted by Term Frequency-Inverse Document Frequency (TF-IDF). All algorithms were designed to predict the risk of bacteremia in ED patients whose blood cultures were sent to the laboratory.
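To illustrate the bag-of-words TF-IDF representation mentioned above, here is a minimal sketch in plain Python. It assumes the textbook weighting (term frequency times ln(N/df)); the abstract does not state which TF-IDF variant or tokenizer the study used, and the function name `tfidf` and the example notes are invented for illustration only.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (not taken from the study):
    weight = (count of term in doc / doc length) * ln(N / document frequency).
    """
    tokenized = [d.lower().split() for d in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (c / len(toks)) * math.log(n_docs / df[term])
            for term, c in counts.items()
        })
    return vectors

# Hypothetical free-text snippets, not study data.
notes = ["fever chills rigors", "chest pain no fever", "fever cough"]
vecs = tfidf(notes)
```

A term that appears in every note ("fever" here) receives zero weight under this variant, while rarer terms such as "chills" receive positive weight. Vectors of this kind would then serve as the input features for a logistic regression classifier.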
The study cohort comprised 94,482 individuals, of whom 52% were male. The prevalence of bacteremia in the entire cohort was 9.7%. The XGBoost model trained on the tabular data yielded an area under the curve (AUC) of 73.7%, while the LR model trained on the free text achieved an AUC of 71.3%. After a range of weights was evaluated, the best combination assigned 55% weight to the XGBoost prediction and 45% weight to the LR prediction. The final ensemble prediction yielded an AUC of 75.6%.
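The weight search described above can be sketched as a simple sweep over convex combinations of the two models' predicted probabilities, keeping the weight with the highest AUC. This is an illustrative sketch only: the abstract does not report the grid of weights, the data split used for the search, or the implementation, so the 0.05 step, the helper names (`auc`, `blend`, `best_weight`), and the toy labels and scores are all assumptions.

```python
def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def blend(p_tab, p_txt, w_tab):
    """Weighted average: w_tab on the tabular model, 1 - w_tab on the text model."""
    return [w_tab * a + (1 - w_tab) * b for a, b in zip(p_tab, p_txt)]

def best_weight(labels, p_tab, p_txt, step=0.05):
    """Sweep w_tab over [0, 1] and return the weight with the highest AUC."""
    n = round(1 / step)
    candidates = [i / n for i in range(n + 1)]
    return max(candidates, key=lambda w: auc(labels, blend(p_tab, p_txt, w)))

# Toy validation-set labels and predictions (hypothetical, not study data).
y = [1, 0, 1, 0, 1, 0]
p_xgb = [0.9, 0.2, 0.4, 0.5, 0.7, 0.1]  # stand-in for XGBoost probabilities
p_lr = [0.6, 0.3, 0.8, 0.2, 0.3, 0.4]   # stand-in for LR probabilities
w = best_weight(y, p_xgb, p_lr)
blended_auc = auc(y, blend(p_xgb, p_lr, w))
```

Because the sweep includes the weights 0 and 1, the selected blend scores at least as well as either model alone on the data used for the search. A held-out set is still needed to confirm that the gain generalizes.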
Applying artificial intelligence to bacteremia surveillance in the ED setting through a combination of free-text and tabular data analysis improved predictive performance compared with using tabular data alone. We recommend that future AI applications based on our findings be assimilated into the clinical routines of ED physicians.