Van Dung Hoang, Tan Vu Manh, Dieu Nguyen Thi, Van Linh Pham, Van Khai Nguyen, Ngan Tran Thi, Phuong Nguyen Thi Thu
Department of Internal Medicine, Hai Phong International Hospital, Hai Phong, 180000, Vietnam.
Department of Internal Medicine, Hai Phong University of Medicine and Pharmacy, Hai Phong, 180000, Vietnam.
BMC Med Inform Decis Mak. 2025 Jul 15;25(1):265. doi: 10.1186/s12911-025-03107-3.
Drug-induced immune thrombocytopenia (DITP) is a rare but potentially life-threatening adverse drug reaction, often underrecognized due to its nonspecific presentation and the lack of real-time diagnostic tools. Early identification of at-risk patients is critical to improving medication safety and preventing severe complications.
This study aimed to develop and externally validate a machine learning model for predicting the risk of DITP from routinely collected hospital data, and to optimize its clinical applicability through threshold adjustment.
We conducted a retrospective cohort study using electronic medical records from Hai Phong International Hospital (2018-2024) for model development and internal validation. An independent cohort from Hai Phong International Hospital - Vinh Bao (2024) served as the external validation cohort. Eligible patients had received at least one drug previously implicated in DITP and had serial platelet counts. A Light Gradient Boosting Machine (LightGBM) model was trained on demographic, clinical, laboratory, and pharmacological features. Model performance was assessed using area under the receiver operating characteristic (ROC) curve (AUC), accuracy, recall, and F1-score. SHapley Additive exPlanations (SHAP) were used to interpret feature contributions. Threshold tuning and decision curve analysis (DCA) were used to assess clinical applicability.
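The threshold tuning and decision curve analysis mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the candidate threshold grid, the choice of F1 as the tuning criterion, and the toy labels and probabilities are all assumptions made for illustration. It uses only the standard definitions of F1 and of net benefit, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when predicting positive if prob >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                      candidates: Sequence[float]) -> Tuple[float, float]:
    """Scan candidate thresholds and return the first one with the highest F1."""
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        tp, fp, fn, _ = confusion_counts(y_true, y_prob, t)
        score = f1_score(tp, fp, fn)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit of the model at one threshold probability."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy data (hypothetical, not from the study): 1 = DITP, 0 = no DITP.
y_true: List[int] = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob: List[float] = [0.02, 0.03, 0.05, 0.08, 0.07, 0.12, 0.15, 0.40]

# Hypothetical grid of thresholds from 0.01 to 0.50.
grid = [i / 100 for i in range(1, 51)]
best_t, best_f1 = best_f1_threshold(y_true, y_prob, grid)
print(f"best threshold={best_t:.2f}, F1={best_f1:.3f}, "
      f"net benefit={net_benefit(y_true, y_prob, best_t):.3f}")
```

For a rare outcome such as DITP, predicted probabilities tend to be small, so a tuned threshold can sit well below the default of 0.5. In practice the threshold would be chosen on validation data rather than on the data used for the final evaluation.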
Among 17,546 patients in the training cohort and 1,403 in the external cohort, DITP occurred in 432 (2.46%) and 70 (4.99%) patients, respectively. In internal validation, LightGBM achieved an AUC of 0.860, a recall of 0.392, and an F1-score of 0.310. In external validation, the model achieved an AUC of 0.813 and an F1-score of 0.341 at the optimized threshold (0.09), supporting its robustness. SHAP analysis identified aspartate aminotransferase (AST), baseline platelet count, and renal function as key contributors. DCA and clinical impact curves demonstrated potential benefit in supporting real-time risk stratification. Clopidogrel and vancomycin were frequently associated with suspected DITP cases.
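The reported figures can be checked with simple arithmetic. The sketch below recomputes the DITP incidence in each cohort and derives the precision implied by the internal-validation recall and F1-score, using the identity F1 = 2PR / (P + R), so P = F1 * R / (2R - F1). The implied precision is not reported in the abstract; it is an approximation that assumes recall and F1 were measured at the same threshold and ignores rounding in the published values. The helper name is hypothetical.

```python
def precision_from_f1_recall(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for precision P, given F1 and recall R."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 and recall are inconsistent (need 2 * recall > F1)")
    return f1 * recall / denom


# Incidence of DITP, recomputed from the reported counts.
train_incidence = 432 / 17546
external_incidence = 70 / 1403

# Precision implied by the internal-validation metrics (derived, not reported).
internal_precision = precision_from_f1_recall(f1=0.310, recall=0.392)

print(f"training incidence: {train_incidence:.2%}")
print(f"external incidence: {external_incidence:.2%}")
print(f"implied internal precision: {internal_precision:.3f}")
```

Under these assumptions, roughly one in four patients flagged in internal validation would be a true DITP case, a common trade-off when screening for a low-incidence outcome.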
This externally validated machine learning model enables early identification of hospitalized patients at risk of DITP using data available in routine care. Its integration into electronic medical records may support clinical decision-making, reduce diagnostic delays, and improve pharmacovigilance practices in hospital settings.