Improving the accuracy of automated gout flare ascertainment using natural language processing of electronic health records and linked Medicare claims data.

Author information

Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA.

Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA.

Publication information

Pharmacoepidemiol Drug Saf. 2024 Jan;33(1):e5684. doi: 10.1002/pds.5684. Epub 2023 Aug 31.

DOI:10.1002/pds.5684

PMID:37654015

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10873073/

Abstract

BACKGROUND

We aimed to determine whether integrating concepts extracted from electronic health record (EHR) notes using natural language processing (NLP) could improve the identification of gout flares.

METHODS

Using Medicare claims linked with EHR data, we selected gout patients who initiated urate-lowering therapy (ULT). Each patient's 12-month baseline period and on-treatment follow-up were segmented into 1-month units. We retrieved EHR notes for months with gout diagnosis codes and processed the notes to extract NLP concepts. We selected a random sample of 500 patients and reviewed each of their notes for the presence of a physician-documented gout flare. Months containing at least 1 note mentioning a gout flare were considered months with events. We used 60% of patients to train predictive models with LASSO. We evaluated the models by the area under the curve (AUC) in the validation data and examined positive and negative predictive values (PPV/NPV).
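The month-level labeling rule and the patient-level 60% training split described above can be sketched roughly as follows. This is only an illustration, not the authors' code: the function names, the tuple layout of `notes`, the 30-day definition of a "month", and the fixed random seed are all assumptions made for the example.

```python
import random
from collections import defaultdict
from datetime import date


def month_index(start: date, d: date) -> int:
    """0-based index of the 1-month unit containing d.

    Assumption: a "month" is a 30-day window counted from `start`;
    the abstract does not say how the units were defined.
    """
    return (d - start).days // 30


def label_months(notes):
    """Label each (patient, month) unit from reviewed notes.

    `notes` is an iterable of (patient_id, month, mentions_flare) tuples
    (a hypothetical layout). A month is an event month (label 1) if at
    least one of its notes mentions a gout flare, otherwise 0.
    """
    labels = defaultdict(int)
    for patient_id, month, mentions_flare in notes:
        labels[(patient_id, month)] |= int(mentions_flare)
    return dict(labels)


def split_patients(patient_ids, train_frac=0.6, seed=0):
    """Split at the patient level so one patient's months never
    appear in both the training and the validation set."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    cut = round(train_frac * len(ids))
    return set(ids[:cut]), set(ids[cut:])


# Toy data, invented for illustration only.
notes = [
    ("p1", 0, False),
    ("p1", 0, True),   # second note in the same month mentions a flare
    ("p1", 1, False),
    ("p2", 3, False),
]
labels = label_months(notes)
train_ids, valid_ids = split_patients(f"p{i}" for i in range(10))
```

Splitting by patient rather than by month is a design choice in this sketch: it keeps correlated months from the same person out of the validation data.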

RESULTS

We extracted and labeled 839 months of follow-up (280 with gout flares). The claims-only model selected 20 variables (AUC = 0.69). The NLP concept-only model selected 15 concepts (AUC = 0.69). The combined model selected 32 claims variables and 13 NLP concepts (AUC = 0.73). The claims-only model had a PPV of 0.64 [0.50, 0.77] and an NPV of 0.71 [0.65, 0.76], whereas the combined model had a PPV of 0.76 [0.61, 0.88] and an NPV of 0.71 [0.65, 0.76].
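For readers less familiar with the reported metrics, the sketch below shows one way AUC, PPV, and NPV could be computed from predicted scores and month-level labels. It is a minimal illustration under stated assumptions: the abstract does not say how the bracketed intervals were derived or which score threshold was used, so the Wilson interval and the `threshold` argument here are assumptions, and the toy scores are invented rather than taken from the study.

```python
import math


def auc(scores, labels):
    """AUC as the probability that a randomly chosen event month is
    scored higher than a randomly chosen non-event month (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv_npv(scores, labels, threshold):
    """PPV = TP / (TP + FP), NPV = TN / (TN + FN) at a score cutoff.

    Assumes at least one predicted positive and one predicted negative.
    """
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_flare = s >= threshold
        if predicted_flare and y:
            tp += 1
        elif predicted_flare:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fp), tn / (tn + fn)


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion
    (an assumed choice; the paper's interval method is not stated)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Toy validation data, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
model_auc = auc(scores, labels)
ppv, npv = ppv_npv(scores, labels, threshold=0.5)
ppv_low, ppv_high = wilson_interval(successes=2, total=3)
```

Because PPV and NPV depend on both the cutoff and the event prevalence in the validation months, two models can share the same NPV while differing in PPV, as in the results above.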

CONCLUSION

Adding NLP concept variables to claims variables resulted in a small improvement in the identification of gout flares. Our data-driven claims-only model and our combined claims/NLP-concept model outperformed existing rule-based claims algorithms reliant on medication use, diagnosis, and procedure codes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc54/10873073/1f3fe68926f4/nihms-1940122-f0001.jpg
