Yıldırım Mehmet, Sezik Savaş, Başar Ayşe
Computer Engineering Department, Boğaziçi University, Istanbul, Türkiye.
Emergency Medicine, Odemis State Hospital, Izmir, Türkiye.
J Comput Biol. 2025 Jun;32(6):584-600. doi: 10.1089/cmb.2024.0632. Epub 2025 May 22.
Accurate triage in emergency rooms is crucial for efficient patient care and resource allocation. We developed models to predict triage levels using several traditional machine learning methods (logistic regression, random forest, XGBoost) and neural network-based deep learning approaches. These models were tested on a dataset of emergency department visits at a local Turkish hospital; the dataset consists of both structured and unstructured data. Compared with previous work, our challenge was to build a predictive model that uses documents written in Turkish and that handles specific aspects of the Turkish medical system. Text embedding techniques such as Bag of Words, Word2Vec, and BERT-based embeddings were used to process the unstructured patient complaints. We used a comprehensive set of features, including patient history data and disease diagnosis, within our predictive models, which included advanced neural network architectures such as convolutional neural networks, attention mechanisms, and long short-term memory networks. Our results revealed that BERT embeddings significantly enhanced the performance of neural network models, while Word2Vec embeddings showed slightly better results in traditional machine learning models. The most effective model was XGBoost combined with Word2Vec embeddings, achieving 86.7% AUC, 81.5% accuracy, and 68.7% weighted F1 score. We conclude that text embedding methods and machine learning methods are effective tools for predicting emergency room triage levels. The integration of patient history into the models, alongside the strategic use of text embeddings, significantly improves predictive accuracy.
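The abstract reports both accuracy and a weighted F1 score, and the two can diverge when triage levels are imbalanced. The following is a minimal sketch of how these two metrics are conventionally defined, not the authors' evaluation code: the function names, the integer triage level ids, and the toy labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of visits whose predicted level equals the true level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true examples per class
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        total += n * f1
    return total / len(y_true)


# Hypothetical toy data: three triage levels (1, 2, 3) with imbalanced support.
y_true = [1, 1, 1, 1, 1, 1, 2, 2, 2, 3]
y_pred = [1, 1, 1, 1, 1, 1, 1, 2, 2, 2]

print(f"accuracy    = {accuracy(y_true, y_pred):.3f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
```

On this toy data the rare level 3 is never predicted, so its F1 is zero and the weighted F1 falls below the accuracy, even though most visits are classified correctly.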