Comparison of Diagnosis Codes to Clinical Notes in Classifying Patients with Diabetic Retinopathy.

Authors

Yonamine Sean, Ma Chu Jian, Alabi Rolake O, Kaidonis Georgia, Chan Lawrence, Borkar Durga, Stein Joshua D, Arnold Benjamin F, Sun Catherine Q

Affiliations

Department of Ophthalmology, University of California, San Francisco, California.

Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland.

Publication information

Ophthalmol Sci. 2024 Jun 14;4(6):100564. doi: 10.1016/j.xops.2024.100564. eCollection 2024 Nov-Dec.

Abstract

PURPOSE

Electronic health records (EHRs) contain a vast amount of clinical data. Improved automated classification approaches have the potential to accurately and efficiently identify patient cohorts for research. We evaluated whether a rule-based natural language processing (NLP) algorithm using clinical notes performed better for classifying proliferative diabetic retinopathy (PDR) and nonproliferative diabetic retinopathy (NPDR) severity than International Classification of Diseases, Ninth Revision (ICD-9) or Tenth Revision (ICD-10) codes.

DESIGN

Cross-sectional study.

SUBJECTS

Deidentified EHR data from an academic medical center identified 2366 patients aged ≥18 years, with diabetes mellitus, diabetic retinopathy (DR), and available clinical notes.

METHODS

From these 2366 patients, 306 randomly selected patients (100 training set, 206 test set) underwent chart review by ophthalmologists to establish the gold standard. International Classification of Diseases codes were extracted from the EHR. The notes algorithm identified positive mentions of PDR and NPDR severity in clinical notes. The PDR and NPDR severity classifications produced by ICD codes and by the notes algorithm were compared with the gold standard. The entire DR cohort (N = 2366) was then classified as having presence (or absence) of PDR using ICD codes and the notes algorithm.
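The abstract does not give the rules used by the notes algorithm, so the following is only a minimal sketch of what "positive mention" detection could look like. The patterns, the negation cue list, and the function name `has_positive_pdr_mention` are all hypothetical and are not taken from the study.

```python
import re

# Hypothetical pattern for a PDR mention. The lookbehinds keep it from firing
# inside "NPDR", "non-proliferative", or "non proliferative".
PDR_PATTERN = re.compile(
    r"(?<![\w-])(?<!non )(?:pdr|proliferative diabetic retinopathy)\b",
    re.IGNORECASE,
)

# Hypothetical negation cues; a real rule set would likely be much longer.
NEGATION_PATTERN = re.compile(
    r"\b(?:no|not|without|negative for|rule out)\b",
    re.IGNORECASE,
)


def has_positive_pdr_mention(note: str) -> bool:
    """Return True if any sentence mentions PDR without a preceding negation cue."""
    # Crude sentence splitting on periods, semicolons, and newlines.
    for sentence in re.split(r"[.;\n]+", note):
        for match in PDR_PATTERN.finditer(sentence):
            preceding = sentence[: match.start()]
            if not NEGATION_PATTERN.search(preceding):
                return True
    return False


if __name__ == "__main__":
    for text in (
        "Proliferative diabetic retinopathy, right eye.",
        "No evidence of PDR in either eye.",
        "Moderate NPDR both eyes.",
    ):
        print(text, "->", has_positive_pdr_mention(text))
```

A patient-level label could then be derived by flagging a patient as PDR when any of their notes returns True, although whether the study aggregated notes this way is an assumption.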

MAIN OUTCOME MEASURES

Sensitivity, specificity, positive predictive value (PPV), negative predictive value, and F1 score for the notes algorithm compared with ICD codes using a gold standard of chart review.
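For readers less familiar with these measures, the sketch below shows how each one follows from the four cells of a 2x2 table comparing a classifier with chart review. The class and function names are illustrative, and the counts are invented for the example; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Classifier calls tabulated against the gold standard (chart review)."""
    tp: int  # classifier positive, gold standard positive
    fp: int  # classifier positive, gold standard negative
    fn: int  # classifier negative, gold standard positive
    tn: int  # classifier negative, gold standard negative


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the five outcome measures; assumes no denominator is zero."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        # F1 is the harmonic mean of PPV and sensitivity.
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Made-up counts for illustration only.
example = ConfusionCounts(tp=90, fp=5, fn=10, tn=95)
metrics = classification_metrics(example)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```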

RESULTS

For PDR classification of the test set patients, the notes algorithm performed better than ICD codes for all metrics. Specifically, the notes algorithm had significantly higher sensitivity (90.5% [95% confidence interval 85.7, 94.9] vs. 68.4% [60.4, 75.3]), but similar PPV (98.0% [95.4, 100] vs. 94.7% [90.3, 98.3]). The F1 score was 0.941 [0.910, 0.966] for the notes algorithm compared with 0.794 [0.734, 0.842] for ICD codes. For PDR classification, ICD-10 codes performed better than ICD-9 codes (F1 score 0.836 [0.771, 0.878] vs. 0.596 [0.222, 0.692]). For NPDR severity classification, the notes algorithm performed similarly to ICD codes, but performance was limited by small sample size.
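As a quick consistency check, the reported F1 scores can be approximately recovered from the reported sensitivity and PPV point estimates, since F1 is their harmonic mean. The helper name below is illustrative, and because the inputs are rounded percentages the agreement is only expected to about three decimal places.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates as reported in the Results for PDR classification.
notes_f1 = f1_from_ppv_and_sensitivity(ppv=0.980, sensitivity=0.905)
icd_f1 = f1_from_ppv_and_sensitivity(ppv=0.947, sensitivity=0.684)

if __name__ == "__main__":
    print(f"notes algorithm F1: {notes_f1:.3f}")  # reported: 0.941
    print(f"ICD codes F1:       {icd_f1:.3f}")    # reported: 0.794
```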

CONCLUSIONS

The notes algorithm outperformed ICD codes for PDR classification. The findings demonstrate the significant potential of applying a rule-based NLP algorithm to clinical notes to increase the efficiency and accuracy of cohort selection for research.

FINANCIAL DISCLOSURES

Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b22/11382306/d60cae3b76ed/gr1.jpg
