A machine learning algorithm for the detection of paroxysmal nocturnal haemoglobinuria (PNH) in UK primary care electronic health records.

Affiliations

Mendelian, The Trampery, Old Street, London, UK.

St. James's University Hospital, Leeds, UK.

Publication information

Orphanet J Rare Dis. 2024 Oct 13;19(1):378. doi: 10.1186/s13023-024-03406-4.

Abstract

BACKGROUND

Paroxysmal Nocturnal Haemoglobinuria (PNH) is an ultra-rare, acquired disorder that is challenging to diagnose due to varied symptoms, heterogeneous patient presentations, and lack of awareness among healthcare professionals. This leads to frequent misdiagnosis and delays in diagnosis. This study evaluated the feasibility of a machine learning model to identify undiagnosed PNH patients using structured electronic health records.

METHODS

The study used data from the Optimum Patient Care Research Database, which contains electronic health records from general practitioner (GP) practices across the United Kingdom. PNH patients were identified by the presence, and control patients by the absence, of a PNH diagnosis code in their records. Clinical features (symptoms, diagnoses, healthcare utilisation) from 131 patients in the PNH group and 593,838 patients in the control group were input to a tree-based XGBoost machine learning model to classify patients as either "positive" or "negative" for PNH suspicion. The algorithm was finalised after additional exclusions and inclusions were applied. Performance was assessed using positive predictive value (PPV), recall and specificity. As the sample used to develop the algorithm was not representative of the true population prevalence, PPV was additionally adjusted to reflect performance in the wider population.
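The three performance measures named above are simple ratios of confusion-matrix counts. The following is a minimal sketch of those definitions, not the authors' code; the `ConfusionCounts` class and the example counts are hypothetical and were invented for illustration, since the abstract does not report the underlying counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing model flags against recorded PNH diagnosis codes."""
    tp: int  # PNH patients classified as positive
    fp: int  # control patients classified as positive
    tn: int  # control patients classified as negative
    fn: int  # PNH patients classified as negative


def recall(c: ConfusionCounts) -> float:
    """Share of PNH patients that the model flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of control patients that the model leaves unflagged."""
    return c.tn / (c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Share of flagged patients that have a PNH diagnosis code."""
    return c.tp / (c.tp + c.fp)


# Hypothetical counts, NOT taken from the study.
example = ConfusionCounts(tp=30, fp=20, tn=99_980, fn=70)
print(f"recall={recall(example):.4f}")
print(f"specificity={specificity(example):.4f}")
print(f"ppv={ppv(example):.4f}")
```

Note that PPV, unlike recall and specificity, depends on the ratio of cases to controls in the sample, which is why the study reports a prevalence-adjusted figure as well.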

RESULTS

Of all the patients in the PNH group, 27% were classified as positive (recall). 99.99% of the control group were classified as negative (specificity). Of all the patients classified as positive, 60.4% had a diagnosis of PNH in their record (PPV). The PPV adjusted for the population prevalence of PNH was 19.59%, suggesting that nearly 1 in 5 patients flagged may warrant further PNH investigation. The key clinical features in the model were aplastic anaemia, pancytopenia, haemolytic anaemia, myelodysplastic syndrome, and Budd-Chiari syndrome.
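The abstract does not state how the PPV was adjusted for prevalence. One standard way to do this, shown here as a sketch under that assumption, is Bayes' theorem applied to recall, specificity and an assumed population prevalence. The function name and the prevalence value below are hypothetical placeholders; the abstract does not give the prevalence the authors used, and because the reported recall and specificity are rounded, this calculation should not be expected to reproduce the 19.59% figure exactly.

```python
def prevalence_adjusted_ppv(sensitivity: float, specificity: float,
                            prevalence: float) -> float:
    """PPV at a given prevalence via Bayes' theorem.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    true_pos = sensitivity * prevalence                   # flagged and has PNH
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # flagged, no PNH
    if true_pos + false_pos == 0.0:
        raise ValueError("no patients would be flagged; PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Recall and specificity as reported (rounded) in the abstract.
REPORTED_RECALL = 0.27
REPORTED_SPECIFICITY = 0.9999
# Placeholder prevalence, assumed for illustration only.
ASSUMED_PREVALENCE = 1e-4

adjusted = prevalence_adjusted_ppv(REPORTED_RECALL, REPORTED_SPECIFICITY,
                                   ASSUMED_PREVALENCE)
print(f"adjusted PPV at assumed prevalence: {adjusted:.2%}")
```

The formula makes the sensitivity to specificity visible: at very low prevalence, the false-positive term dominates the denominator, so even small changes in specificity shift the adjusted PPV considerably.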

CONCLUSION

This is the first study to combine clinical understanding of PNH with machine learning, demonstrating the ability to discriminate between PNH and control patients in retrospective electronic health records. With further investigation and validation, this algorithm could be deployed on live health data, potentially leading to earlier diagnosis for patients who currently experience long diagnostic delays or remain undiagnosed.
