Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records.

Author information

Zhao Sizheng Steven, Hong Chuan, Cai Tianrun, Xu Chang, Huang Jie, Ermann Joerg, Goodson Nicola J, Solomon Daniel H, Cai Tianxi, Liao Katherine P

Affiliations

Institute of Ageing and Chronic Disease, University of Liverpool.

Department of Academic Rheumatology, Aintree University Hospital, Liverpool, UK.

Publication information

Rheumatology (Oxford). 2020 May 1;59(5):1059-1065. doi: 10.1093/rheumatology/kez375.

Abstract

OBJECTIVES

To develop classification algorithms that accurately identify patients with axial spondyloarthritis (axSpA) in electronic health records, and to compare the performance of algorithms incorporating free-text data against approaches using only International Classification of Diseases (ICD) codes.

METHODS

An enriched cohort of 7853 eligible patients was created from the electronic health records of two large hospitals using automated searches (≥1 ICD code combined with simple text searches). Key disease concepts were extracted from free-text data using natural language processing (NLP) and combined with ICD codes to develop algorithms. We created both supervised regression-based algorithms (on a training set of 127 axSpA cases and 423 non-cases) and unsupervised algorithms to identify patients with a high probability of having axSpA from the enriched cohort. Their performance was compared against classifications using ICD codes only.
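The enrichment step described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the `Patient` record, the code set, the search pattern and the `require_both` switch are all hypothetical. The abstract does not list the codes or search terms used, nor whether a patient had to satisfy both criteria or either one, so the sketch exposes that choice as a parameter.

```python
import re
from dataclasses import dataclass
from typing import List

# Illustrative values only: ICD-9 720.0 and ICD-10 M45 denote ankylosing
# spondylitis, but the study's actual code list is not given in the abstract.
AXSPA_ICD_CODES = {"720.0", "M45"}

# Hypothetical "simple text search" pattern (assumption, not from the source).
TEXT_PATTERN = re.compile(
    r"ankylosing spondylitis|sacroiliitis|hla-?b27", re.IGNORECASE
)


@dataclass
class Patient:
    """Hypothetical minimal patient record."""
    patient_id: str
    icd_codes: List[str]
    notes: List[str]


def in_enriched_cohort(patient: Patient, require_both: bool = False) -> bool:
    """Screen one patient for the enriched cohort.

    has_code: at least one qualifying ICD code is present.
    has_text: at least one note matches the text pattern.
    require_both is an assumption knob: False keeps patients meeting
    either criterion, True keeps only patients meeting both.
    """
    has_code = any(code in AXSPA_ICD_CODES for code in patient.icd_codes)
    has_text = any(TEXT_PATTERN.search(note) for note in patient.notes)
    return (has_code and has_text) if require_both else (has_code or has_text)


patients = [
    Patient("a", ["720.0"], ["No relevant text"]),
    Patient("b", ["401.9"], ["MRI shows sacroiliitis"]),
    Patient("c", ["401.9"], ["Hypertension follow-up"]),
]
cohort_ids = [p.patient_id for p in patients if in_enriched_cohort(p)]
```

A broad screen of this kind is meant to be sensitive rather than specific; the classification algorithms are then applied to the resulting cohort to find the patients with a high probability of axSpA.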

RESULTS

NLP extracted four disease concepts of high predictive value: ankylosing spondylitis, sacroiliitis, HLA-B27 and spondylitis. The unsupervised algorithm, incorporating both the NLP concept and the ICD code for ankylosing spondylitis (AS), identified the greatest number of patients. With the probability threshold set to attain 80% positive predictive value, it identified 1509 axSpA patients (mean age 53 years, 71% male). Sensitivity was 0.78, specificity 0.94 and area under the curve 0.93. The two supervised algorithms performed similarly but identified fewer patients. All three outperformed traditional approaches using ICD codes alone (area under the curve 0.80 to 0.87).
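Choosing a probability threshold to reach a target positive predictive value (PPV) can be illustrated with a small sketch. The scores and labels below are toy data invented for the example, and picking the lowest qualifying cut-off (which keeps the most patients) is an assumption; the abstract does not say how the threshold was selected among candidates.

```python
def confusion(scores, labels, threshold):
    """Count (tp, fp, tn, fn) when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def lowest_threshold_for_ppv(scores, labels, target_ppv=0.80):
    """Return the smallest observed score whose PPV meets the target.

    PPV need not rise monotonically with the threshold, so every
    observed score is tried in ascending order. Returns None if no
    cut-off reaches the target.
    """
    for threshold in sorted(set(scores)):
        tp, fp, _, _ = confusion(scores, labels, threshold)
        if tp + fp > 0 and tp / (tp + fp) >= target_ppv:
            return threshold
    return None


# Toy predicted probabilities and chart-review labels (1 = axSpA), made up
# for illustration; they are not data from the study.
scores = [0.95, 0.90, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

threshold = lowest_threshold_for_ppv(scores, labels, target_ppv=0.80)
tp, fp, tn, fn = confusion(scores, labels, threshold)
ppv = tp / (tp + fp)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
```

On this toy data the cut-off lands at 0.40, where five of the six flagged patients are true cases. Fixing PPV first and then reading off sensitivity and specificity mirrors the order in which the results above are reported.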

CONCLUSION

Algorithms incorporating free-text data can accurately identify axSpA patients in electronic health records. Large cohorts identified using these novel methods offer exciting opportunities for future clinical research.
