Finding missed cases of familial hypercholesterolemia in health systems using machine learning.

Author information

Banda Juan M, Sarraju Ashish, Abbasi Fahim, Parizo Justin, Pariani Mitchel, Ison Hannah, Briskin Elinor, Wand Hannah, Dubois Sebastien, Jung Kenneth, Myers Seth A, Rader Daniel J, Leader Joseph B, Murray Michael F, Myers Kelly D, Wilemon Katherine, Shah Nigam H, Knowles Joshua W

Affiliations

1. Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA.

2. Department of Computer Science, Georgia State University, Atlanta, GA, USA.

Publication information

NPJ Digit Med. 2019 Apr 11;2:23. doi: 10.1038/s41746-019-0101-5. eCollection 2019.

Abstract

Familial hypercholesterolemia (FH) is an underdiagnosed dominant genetic condition that affects approximately 0.4% of the population and carries up to a 20-fold increased risk of coronary artery disease if untreated. Simple screening strategies have false positive rates greater than 95%. As part of the FH Foundation's FIND FH initiative, we developed a classifier to identify potential FH patients using electronic health record (EHR) data at Stanford Health Care. We trained a random forest classifier using data from known patients (n = 197) and matched non-cases (n = 6590). Our classifier obtained a positive predictive value (PPV) of 0.88 and a sensitivity of 0.75 on a held-out test set. We evaluated the accuracy of the classifier's predictions by chart review of 100 patients at risk of FH who were not included in the original dataset. The classifier correctly flagged 84% of patients at the highest probability threshold, with performance decreasing as the threshold was lowered. In external validation on 466 FH patients (236 with genetically proven FH) and 5000 matched non-cases from the Geisinger Healthcare System, our FH classifier achieved a PPV of 0.85. Our EHR-derived FH classifier is effective in finding candidate patients for further FH screening. Such machine learning-guided strategies can lead to effective identification of the highest-risk patients for enhanced management strategies.
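For readers less familiar with the two metrics reported above, the following is a minimal Python sketch of how PPV and sensitivity are computed from confusion-matrix counts, and of why a condition with roughly 0.4% prevalence tends to yield a low PPV under simple screening. The counts (75, 10, 25) and the screening sensitivity and specificity (0.90, 0.95) are hypothetical values chosen for illustration; they are not taken from the study, and the function names are our own.

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: fraction of flagged patients who truly have the condition."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity (recall): fraction of true cases that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def ppv_from_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """PPV implied by a test's sensitivity and specificity at a given prevalence (Bayes' rule)."""
    flagged_true = sens * prevalence
    flagged_false = (1.0 - spec) * (1.0 - prevalence)
    return flagged_true / (flagged_true + flagged_false)


# Hypothetical counts, picked only so the metrics land near the reported 0.88 and 0.75.
example_ppv = ppv(true_pos=75, false_pos=10)
example_sens = sensitivity(true_pos=75, false_neg=25)

# Hypothetical screening test applied at the 0.4% prevalence cited in the abstract.
screen_ppv = ppv_from_prevalence(sens=0.90, spec=0.95, prevalence=0.004)

print(f"PPV = {example_ppv:.2f}, sensitivity = {example_sens:.2f}")
print(f"Screening PPV at 0.4% prevalence = {screen_ppv:.3f}")
```

Under these assumed screening characteristics, fewer than one in ten flagged patients would be a true case, which illustrates why a classifier with a much higher PPV is useful for prioritizing follow-up.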

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01fe/6550268/69ca661c8e5e/41746_2019_101_Fig1_HTML.jpg
