Efros Orly, Soffer Shelly, Mudrik Aya, Robinson Renana, Kenet Gili, Nadkarni Girish N, Klang Eyal
National Hemophilia Center and Institute of Thrombosis & Hemostasis, Chaim Sheba Medical Center, Tel Hashomer, Israel
Gray Faculty of Medical and Health Sciences, Tel Aviv-Yafo, Israel.
BMJ Open. 2025 Aug 12;15(8):e097016. doi: 10.1136/bmjopen-2024-097016.
OBJECTIVES: This study aimed to develop and validate a machine-learning (ML) model to predict iron deficiency without anaemia (IDWA) using routinely collected electronic health record (EHR) data. The primary hypothesis was that an ML model could identify low ferritin levels (<30 ng/mL) in non-anaemic patients more accurately than traditional methods.
DESIGN: A retrospective cohort study.
SETTING: Data were derived from secondary and tertiary care facilities within the eight-hospital Mount Sinai Health System, an urban academic health system.
PARTICIPANTS: The study included 211 486 adult patients (aged ≥18 years) with normal haemoglobin levels (≥130 g/L for men and ≥120 g/L for women) and recorded ferritin measurements.
PRIMARY AND SECONDARY OUTCOME MEASURES: The primary outcome was the prediction of low ferritin levels (<30 ng/mL) using extreme gradient-boosted decision trees, an ML algorithm suited for structured clinical data. The model used demographic data, blood count indices and chemistry results as predictors. Secondary outcomes included subgroup analyses stratified by sex and age to evaluate model performance in different populations.
RESULTS: Of the 211 486 patients analysed, 19.56% (n=41 368) had low ferritin levels. In the low ferritin group, the mean age was 41.28 years and 89.64% were female. In the normal ferritin group, the mean age was 50.14 years and 62.02% were female. The model achieved an area under the curve (AUC) of 0.814. At a sensitivity threshold of 70%, the model had a specificity of 75.85%, a positive predictive value of 37.6% and a negative predictive value of 92.41%. The model outperformed an alternative model based only on complete blood count indices (AUC 0.814 vs 0.741). Subgroup analysis showed that model performance varied by sex and age: it was lower in premenopausal women (AUC 0.736) than in postmenopausal women (AUC 0.793) and in men (AUC 0.832 in those under 60 years and 0.806 in those aged 60 and above).
CONCLUSIONS: The ML model provides an effective approach to screening for IDWA using readily available EHR data. Implementing this tool in clinical settings may facilitate early diagnosis of IDWA.
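The positive and negative predictive values reported in RESULTS depend on prevalence as well as on sensitivity and specificity. The following is a minimal sketch of that relationship, assuming the standard Bayes' rule definitions and using the whole-cohort prevalence (41 368 of 211 486) as the input. The function name `predictive_values` is illustrative and does not come from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Operating point and counts as reported in the abstract.
SENSITIVITY = 0.70
SPECIFICITY = 0.7585
# Assumption: whole-cohort prevalence stands in for the evaluation-set prevalence.
COHORT_PREVALENCE = 41368 / 211486

ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, COHORT_PREVALENCE)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these inputs the sketch gives a PPV of roughly 41% and an NPV of roughly 91%, close to but not identical with the reported 37.6% and 92.41%. One possible explanation is that the reported values were computed on an evaluation subset with a somewhat lower prevalence than the full cohort. The abstract does not state this, so treat it as an inference.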