Fonseca João, Liu Xiuyun, Oliveira Hélder P, Pereira Tania
Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal.
Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University, Baltimore, MD, United States.
Front Neurol. 2022 Jun 10;13:859068. doi: 10.3389/fneur.2022.859068. eCollection 2022.
Traumatic Brain Injury (TBI) is one of the leading causes of injury-related mortality in the world, with severe cases reaching mortality rates of 30-40%. It is highly heterogeneous in both causes and consequences, which complicates medical interpretation and prognosis. Gathering clinical, demographic, and laboratory data to form a prognosis requires time and skill in several clinical specialties. Machine learning (ML) methods can take advantage of these data and guide physicians toward a better prognosis and, consequently, better healthcare. The objective of this study was to develop and test a wide range of machine learning models, evaluate their ability to predict TBI mortality at hospital discharge, and assess how closely the predictive value of the data matches its clinical significance.
The dataset used was the Hackathon Pediatric Traumatic Brain Injury (HPTBI) dataset, composed of electronic health records containing clinical annotations and demographic data for 300 patients. Four different classification models were tested, each with and without feature selection. For each combination of classification model and feature selection method, the area under the receiver operating characteristic curve (ROC-AUC), balanced accuracy, precision, and recall were calculated.
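As an illustration of the four reported metrics, the following minimal sketch computes them in plain Python for a binary mortality label (1 = died, 0 = survived). The labels, scores, 0.5 decision threshold, and function names are hypothetical and are not taken from the study, which may have used a library implementation instead.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives."""
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive case scores
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not from the HPTBI dataset.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.7, 0.1, 0.3, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("ROC-AUC:", roc_auc(y_true, scores))
print("Balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("Precision:", precision(y_true, y_pred))
print("Recall:", recall(y_true, y_pred))
```

ROC-AUC is computed from the continuous scores, whereas the other three metrics depend on the chosen threshold. Balanced accuracy is informative here because mortality is typically the minority class.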
Methods based on decision trees performed better when using all features (Random Forest, AUC = 0.86; XGBoost, AUC = 0.91), whereas the other models required prior feature selection to obtain their best results (k-Nearest Neighbors, AUC = 0.90; Artificial Neural Networks, AUC = 0.84). Additionally, Random Forest and XGBoost allow feature importance to be assessed, which could give insights for future strategies in clinical routine.
Predictive capability depends greatly on the combination of model and feature selection method used but, overall, the ML models performed well in mortality prediction for TBI (best AUCs of 0.84 to 0.91). The feature importance results indicate that predictive value is not directly related to clinical significance.