Comparison of Diagnosis Codes to Clinical Notes in Classifying Patients with Diabetic Retinopathy.

Authors

Yonamine Sean, Ma Chu Jian, Alabi Rolake O, Kaidonis Georgia, Chan Lawrence, Borkar Durga, Stein Joshua D, Arnold Benjamin F, Sun Catherine Q

Affiliations

Department of Ophthalmology, University of California, San Francisco, California.

Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland.

Publication information

Ophthalmol Sci. 2024 Jun 14;4(6):100564. doi: 10.1016/j.xops.2024.100564. eCollection 2024 Nov-Dec.

DOI:10.1016/j.xops.2024.100564

PMID:39253554

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11382306/

Abstract

PURPOSE

Electronic health records (EHRs) contain a vast amount of clinical data. Improved automated classification approaches have the potential to accurately and efficiently identify patient cohorts for research. We evaluated if a rule-based natural language processing (NLP) algorithm using clinical notes performed better for classifying proliferative diabetic retinopathy (PDR) and nonproliferative diabetic retinopathy (NPDR) severity compared with International Classification of Diseases, ninth edition (ICD-9) or 10th edition (ICD-10) codes.

DESIGN

Cross-sectional study.

SUBJECTS

Deidentified EHR data from an academic medical center identified 2366 patients aged ≥18 years, with diabetes mellitus, diabetic retinopathy (DR), and available clinical notes.

METHODS

From these 2366 patients, 306 randomly selected patients (100 in the training set, 206 in the test set) underwent chart review by ophthalmologists to establish the gold standard. International Classification of Diseases codes were extracted from the EHR. The notes algorithm identified positive mentions of PDR and NPDR severity in clinical notes. The PDR and NPDR severity classifications produced by ICD codes and by the notes algorithm were each compared with the gold standard. The entire DR cohort (N = 2366) was then classified as having presence (or absence) of PDR using ICD codes and the notes algorithm.
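As a rough illustration of what "positive mention" detection can look like in a rule-based approach, the sketch below flags a note when PDR is mentioned without a preceding negation cue in the same sentence. The abstract does not describe the study's actual rules, so the patterns, the negation list, and the function name `has_positive_pdr_mention` are all hypothetical.

```python
import re

# Hypothetical patterns; the study's actual rules are not given in the abstract.
# The lookbehinds skip "non-proliferative" / "non proliferative", and the word
# boundary keeps "PDR" from matching inside "NPDR".
PDR_PATTERN = re.compile(
    r"(?<!non-)(?<!non )\b(?:proliferative diabetic retinopathy|PDR)\b",
    re.IGNORECASE,
)

# Hypothetical negation cues; a real system would need a far richer list.
NEGATION_PATTERN = re.compile(
    r"\b(?:no|not|without|denies|negative for|rule out|r/o)\b",
    re.IGNORECASE,
)


def has_positive_pdr_mention(note: str) -> bool:
    """Return True if any sentence mentions PDR with no negation cue before it."""
    for sentence in re.split(r"[.;\n]+", note):
        for match in PDR_PATTERN.finditer(sentence):
            preceding_text = sentence[: match.start()]
            if not NEGATION_PATTERN.search(preceding_text):
                return True
    return False


example_notes = [
    "Exam consistent with PDR in the right eye.",
    "No evidence of PDR.",
    "Moderate NPDR both eyes.",
]
flags = [has_positive_pdr_mention(note) for note in example_notes]
```

Treating any negation cue earlier in the sentence as cancelling the mention is a deliberately crude choice; it trades some sensitivity for simplicity and would need tuning against a labeled training set.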

MAIN OUTCOME MEASURES

Sensitivity, specificity, positive predictive value (PPV), negative predictive value, and F1 score for the notes algorithm compared with ICD codes using a gold standard of chart review.
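For readers less familiar with these measures, the following minimal sketch computes all five from a 2x2 confusion table in which chart review supplies the true label. The class and function names are illustrative, the counts in the usage line are made up and are not the study's data, and the code assumes every denominator is nonzero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # classifier positive, gold standard positive
    fp: int  # classifier positive, gold standard negative
    fn: int  # classifier negative, gold standard positive
    tn: int  # classifier negative, gold standard negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard diagnostic accuracy measures; assumes nonzero denominators."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        # F1 is the harmonic mean of PPV and sensitivity.
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Made-up counts for illustration only; not taken from the study.
metrics = diagnostic_metrics(ConfusionCounts(tp=90, fp=5, fn=10, tn=95))
```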

RESULTS

For PDR classification of the test set patients, the notes algorithm performed better than ICD codes for all metrics. Specifically, the notes algorithm had significantly higher sensitivity (90.5% [95% confidence interval 85.7, 94.9] vs. 68.4% [60.4, 75.3]) but similar PPV (98.0% [95.4, 100] vs. 94.7% [90.3, 98.3]). The F1 score was 0.941 [0.910, 0.966] for the notes algorithm compared with 0.794 [0.734, 0.842] for ICD codes. For PDR classification, ICD-10 codes performed better than ICD-9 codes (F1 score 0.836 [0.771, 0.878] vs. 0.596 [0.222, 0.692]). For NPDR severity classification, the notes algorithm performed similarly to ICD codes, but performance was limited by small sample size.
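Because F1 is the harmonic mean of PPV and sensitivity, the reported F1 point estimates can be cross-checked against the reported PPV and sensitivity values. The short sketch below does this arithmetic; the helper name `f1_from` is illustrative, and the check covers only the point estimates, not the confidence intervals, whose method the abstract does not state.

```python
def f1_from(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates as reported in the abstract for PDR classification.
notes_f1 = f1_from(ppv=0.980, sensitivity=0.905)
icd_f1 = f1_from(ppv=0.947, sensitivity=0.684)

print(round(notes_f1, 3))  # 0.941, matching the reported notes-algorithm F1
print(round(icd_f1, 3))    # 0.794, matching the reported ICD-code F1
```

The gap in F1 is driven almost entirely by sensitivity: PPV differs by about 3 percentage points, while sensitivity differs by about 22.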

CONCLUSIONS

The notes algorithm outperformed ICD codes for PDR classification. The findings demonstrate the significant potential of applying a rule-based NLP algorithm to clinical notes to increase the efficiency and accuracy of cohort selection for research.

FINANCIAL DISCLOSURES

Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
