Garnica Oscar, Gómez Diego, Ramos Víctor, Hidalgo J Ignacio, Ruiz-Giardín José M
Departamento de Arquitectura de Computadores, Universidad Complutense de Madrid, Madrid, Spain.
Universidad Complutense de Madrid, Madrid, Spain.
EPMA J. 2021 Aug 31;12(3):365-381. doi: 10.1007/s13167-021-00252-3. eCollection 2021 Sep.
Bacteraemia prediction is relevant because sepsis is one of the most important causes of morbidity and mortality, and bacteraemia prognosis depends primarily on rapid diagnosis. Predicting bacteraemia could shorten the diagnosis by up to 6 days and, in conjunction with individual patient variables, should be considered when deciding on the early administration of personalised antibiotic treatment and medical services, the selection of specific diagnostic techniques and the determination of additional treatments, such as surgery, that would prevent subsequent complications. Machine learning techniques could help physicians make these informed decisions by predicting bacteraemia from the data already available in electronic hospital records.
This study presents the application of machine learning techniques to these records to predict the blood culture's outcome, which would reduce the lag in starting a personalised antibiotic treatment and the medical costs associated with erroneous treatments due to conservative assumptions about blood culture outcomes.
Six supervised classifiers were created using three machine learning techniques, Support Vector Machine, Random Forest and K-Nearest Neighbours, on the electronic health records of hospital patients. The best approach to handle missing data was chosen and, for each machine learning technique, two classification models were created: the first uses the features known at the time of blood extraction, whereas the second uses four extra features revealed during the blood culture.
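The experimental design described above (three techniques, each trained on two feature sets) can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the abstract does not say which missing-data strategy was chosen, so median imputation is an assumption, and the feature-set labels, the helper names (`model_grid`, `impute_median`, `first_columns`, `knn_predict`) and the toy values are all hypothetical. Only a small k-nearest-neighbours vote is implemented, in plain Python.

```python
from itertools import product
from statistics import median

TECHNIQUES = ["SVM", "RandomForest", "KNN"]
# Hypothetical labels: features known at blood extraction, and the same
# features plus the extra ones revealed during the blood culture.
FEATURE_SETS = ["at_extraction", "with_culture_features"]


def model_grid():
    """Enumerate the 3 x 2 = 6 (technique, feature set) combinations."""
    return list(product(TECHNIQUES, FEATURE_SETS))


def impute_median(rows):
    """Replace None with the median of the observed values in that column.

    Assumption: the abstract does not name the chosen imputation method.
    """
    fills = [median([v for v in col if v is not None]) for col in zip(*rows)]
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def first_columns(rows, n):
    """Keep only the first n features (the baseline feature set here)."""
    return [row[:n] for row in rows]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Toy, made-up data: 2 baseline features + 2 extra features, None = missing.
raw = [
    [0.0, 0.1, None, 0.0],
    [0.2, 0.0, 0.1, 0.1],
    [0.1, 0.2, 0.0, 0.2],
    [1.0, 0.9, 1.0, None],
    [0.9, 1.0, 0.8, 1.0],
    [0.8, 0.8, 0.9, 0.9],
]
labels = [0, 0, 0, 1, 1, 1]  # 1 = positive blood culture (toy labels)

X = impute_median(raw)
baseline = first_columns(X, 2)

pred_baseline = knn_predict(baseline, labels, [0.95, 0.95])
pred_extended = knn_predict(X, labels, [0.1, 0.1, 0.1, 0.1])
```

In this sketch the baseline model sees only the first two columns, while the extended model sees all four, mirroring the split between features known at extraction and features revealed during the culture.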
The six classifiers were trained and tested on a dataset of 4357 patients with 117 features per patient. In the best case, the models achieve a state-of-the-art accuracy of 85.9%, a sensitivity of 87.4% and an AUC of 0.93.
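For readers less familiar with the three reported metrics, the sketch below shows how accuracy, sensitivity and AUC are conventionally computed from true labels, predicted labels and predicted scores. The function names and the toy values are illustrative assumptions and do not reproduce the study's data or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up example: 1 = bacteraemia, 0 = no bacteraemia.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is an assumption

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity is often the metric of most clinical interest in a setting like this one, since a false negative corresponds to a missed bacteraemia.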
Our results provide state-of-the-art values for the metrics of interest in predictive medical models, exceeding both the medical practice threshold and previous results in the literature that applied classical modelling techniques to specific types of bacteraemia. The consistency of the results is further supported by the feature importance rankings of the three classifiers, which highlight similar features that coincide with those physicians use in their manual heuristics. The efficacy of these machine learning techniques therefore confirms their viability to support predictive and personalised medicine once a patient presents bacteraemia-compatible symptoms, and to help improve the healthcare economy.