Pan Yuling, Wei Mengqi, Jin Mengyuan, Liang Ying, Yi Tianjiao, Tu Jiancheng, Wu Shimin, Hu Fang, Liang Chunzi
School of Laboratory Medicine, Hubei University of Chinese Medicine, 16 Huangjia Lake West Road, Wuhan, 430065, China.
Hubei Shizhen Laboratory, Hubei University of Chinese Medicine, 16 Huangjia Lake West Road, Wuhan, 430065, China.
EClinicalMedicine. 2025 Apr 3;82:103192. doi: 10.1016/j.eclinm.2025.103192. eCollection 2025 Apr.
Minor head trauma is a frequent cause of emergency department visits. Early identification and prediction of mild traumatic brain injury (mTBI) patients with abnormal brain lesions are vital for minimizing unnecessary computed tomography (CT) scans, reducing radiation exposure, and ensuring timely, effective treatment and care. This study aimed to develop and validate an interpretable machine learning (ML) prediction model that uses routine laboratory data to guide clinical decisions on CT scan use in mTBI patients.
We conducted a multicentre study in China using data from January 2019 to July 2024. The study included three patient cohorts: a retrospective training cohort (654 patients for training and 163 for internal testing) and two prospective validation cohorts (86 internal and 290 external patients). Fifty-one routine clinical laboratory characteristics, readily available from the electronic medical record (EMR) system within the first 24 h of admission, were collected. Seven ML algorithms were trained to develop predictive models, with the random forest (RF) algorithm used to optimize key feature combinations. Model predictive performance was evaluated using metrics such as the area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and F1 score. SHapley Additive exPlanations (SHAP) was applied to interpret the final model, and decision curve analysis (DCA) was used to assess the clinical net benefit.
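As an illustration of the evaluation metrics named above, the following is a minimal sketch, not the authors' pipeline. It computes AUC as the probability that a randomly chosen abnormal-CT case receives a higher score than a randomly chosen normal-CT case, and PPV and F1 at a fixed cut-off. The function names, the 0.5 cut-off, and the six toy labels and scores are assumptions made purely for illustration; they are not study data.

```python
def auc_score(y_true, y_score):
    """AUC via pairwise comparison: P(score of a positive > score of a negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ppv_and_f1(y_true, y_score, threshold=0.5):
    """PPV (precision) and F1 when scores >= threshold are called positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    return ppv, f1


# Toy example (hypothetical values): 1 = abnormal CT, 0 = normal CT.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

toy_auc = auc_score(toy_labels, toy_scores)
toy_ppv, toy_f1 = ppv_and_f1(toy_labels, toy_scores, threshold=0.5)
print(f"AUC={toy_auc:.3f} PPV={toy_ppv:.3f} F1={toy_f1:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-based formulation gives the same value more efficiently.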
In the derivation cohort, 599 (73.3%) patients had normal CT scans and 218 (26.7%) had abnormal CT scans. The gradient boosting classifier (GBC) model performed best among the seven ML models, with an AUC of 0.932 (95% CI: 0.900-0.963). After the features were reduced to 21 (8 biochemical test indicators, 3 coagulation markers, and 10 complete blood cell count indicators) according to feature importance rank, an explainable GBC-final model was established. The final model accurately predicted mTBI patients with abnormal CT in both the internal (AUC 0.926, 95% CI: 0.893-0.958) and external (AUC 0.904, 95% CI: 0.835-0.973) validation cohorts. In the prospective cohort, the final GBC model achieved an AUC of 0.885 (95% CI: 0.753-1.000) and was significantly superior to the traditional TBI biomarkers GFAP (AUC: 0.745) and PGP9.5 (AUC: 0.794). DCA revealed that the final model offered greater net benefit than the "full intervention" or "no intervention" strategies within a probability threshold range of 0.16-0.93. SHAP analysis identified D-dimer levels, absolute lymphocyte and neutrophil counts, and hematocrit as key high-risk features.
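The DCA comparison above rests on the standard net benefit quantity, which weighs true positives against false positives at a chosen probability threshold. The sketch below shows that calculation under the usual definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), together with the "full intervention" (scan everyone) reference; "no intervention" has a net benefit of 0 by definition. The function names and the toy labels, scores, and threshold are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_score, threshold):
    """Net benefit of acting on cases whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    # False positives are down-weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'full intervention' strategy (everyone is scanned)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example (hypothetical values): 1 = abnormal CT, 0 = normal CT.
dca_labels = [0, 0, 1, 1, 0, 1]
dca_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
dca_threshold = 0.30

nb_model = net_benefit(dca_labels, dca_scores, dca_threshold)
nb_all = net_benefit_treat_all(dca_labels, dca_threshold)
nb_none = 0.0  # 'no intervention' never produces true or false positives
print(f"model={nb_model:.3f} treat-all={nb_all:.3f} treat-none={nb_none:.3f}")
```

Repeating this calculation over a grid of thresholds and plotting the three curves yields a decision curve; the range over which the model curve lies above both references corresponds to the kind of threshold interval reported in the findings.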
Our optimal feature selection-based ML model accurately and reliably predicts CT abnormalities in mTBI patients using routine test data. By addressing clinicians' concerns regarding transparency and decision-making through SHAP and DCA analyses, we strengthen the potential clinical applicability of our ML model.
This study was funded by the Natural Science Foundation of Hubei Province, the High-level Talent Research Startup Funding of Hubei University of Chinese Medicine, the Wuhan Health and Family Planning Scientific Research Fund Project of Hubei Province, and the Machine Learning-based Intelligent Diagnosis System for AFP-negative Liver Cancer Project.