Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms.

Affiliations

Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Bulfinch 165, Boston, MA 02114, United States.

Research Information Systems and Computing, Partners Healthcare, United States.

Publication information

Semin Arthritis Rheum. 2019 Aug;49(1):84-90. doi: 10.1016/j.semarthrit.2019.01.002. Epub 2019 Jan 4.

DOI:10.1016/j.semarthrit.2019.01.002

PMID:30665626

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6609504/

Abstract

OBJECTIVE

To use electronic health records (EHRs) to study systemic lupus erythematosus (SLE), algorithms are needed that accurately identify patients with SLE. We used machine learning to generate data-driven SLE EHR algorithms and assessed the performance of existing rule-based algorithms.

METHODS

We randomly selected subjects with ≥1 SLE ICD-9/10 code from our EHR and identified gold standard definite and probable SLE cases by chart review, based on the 1997 American College of Rheumatology (ACR) or 2012 Systemic Lupus International Collaborating Clinics (SLICC) classification criteria. From a training set, we extracted coded and narrative concepts using natural language processing (NLP) and generated algorithms using penalized logistic regression to classify definite or definite/probable SLE. We assessed predictive characteristics in internal and external cohort validations. We also tested the performance characteristics of published rule-based algorithms, with pre-specified permutations of ICD-9 codes, laboratory tests, and medications, in our EHR.
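
As an illustration of the penalized logistic regression step, the sketch below fits a logistic model with an L1 (lasso-type) penalty by proximal gradient descent on a tiny synthetic data set. The abstract does not state which penalty, software, or features were used, so the penalty type, the function names (`fit_l1_logistic`, `predict_proba`), the tuning values, and the two binary features are all assumptions made for illustration, not the study's method or data.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, amount):
    # Proximal operator of the L1 penalty: shrink value toward zero by amount.
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    X is a list of feature rows, y a list of 0/1 labels.
    Returns (intercept, weights). The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the log-loss, then shrinkage for the L1 penalty.
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return b, w


def predict_proba(b, w, x):
    # Predicted probability of being a case for one feature row.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Synthetic, hypothetical data: column 0 is an informative coded feature,
# column 1 is an unrelated flag that carries no information about the label.
X = [[1, 1], [1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1], [0, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

b, w = fit_l1_logistic(X, y)
print("intercept:", round(b, 3), "weights:", [round(wj, 3) for wj in w])
```

On this toy data the penalty shrinks the weight of the uninformative feature to exactly zero, which is the feature-selection behavior that makes penalized regression attractive when many coded and narrative concepts are candidate predictors.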

RESULTS

At a specificity of 97%, our machine learning coded algorithm had 90% positive predictive value (PPV) and 64% sensitivity for definite SLE, and 92% PPV and 47% sensitivity for definite/probable SLE. In the external validation, at 97% specificity, the definite/probable algorithm had 94% PPV and 60% sensitivity. Adding NLP concepts did not improve performance metrics. The PPVs of published rule-based algorithms ranged from 45% to 79% in our EHR.
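
Reporting PPV and sensitivity "at a specificity of 97%" implies choosing a score cutoff that reaches the target specificity and then reading off the other metrics at that cutoff. The sketch below shows one simple way to do this. The function names, the scores, the labels, and the 80% target used for the small example are hypothetical and are not taken from the study.

```python
def confusion_counts(scores, labels, threshold):
    # Classify as a case when the score is at or above the threshold.
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def threshold_for_specificity(scores, labels, target):
    # Specificity never decreases as the threshold rises, so the first
    # threshold that meets the target is the one with the highest sensitivity.
    for threshold in sorted(set(scores)) + [float("inf")]:
        tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
        if tn + fp == 0 or tn / (tn + fp) >= target:
            return threshold


def summarize(tp, fp, tn, fn):
    # Returns None for a metric whose denominator is zero.
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "ppv": tp / (tp + fp) if tp + fp else None,
    }


# Hypothetical model scores and chart-review labels (1 = case, 0 = non-case).
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

cutoff = threshold_for_specificity(scores, labels, target=0.80)
counts = confusion_counts(scores, labels, cutoff)
print(cutoff, counts, summarize(*counts))
```

Raising the target specificity pushes the cutoff higher, which generally trades sensitivity for PPV, the same trade-off visible in the figures reported above.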

CONCLUSION

Our machine learning SLE algorithms performed well in internal and external validation. Rule-based SLE algorithms did not transport as well to our EHR. Unique EHR characteristics, clinical practices and research goals regarding the desired sensitivity and specificity of the case definition must be considered when applying algorithms to identify SLE patients.
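
One general reason PPV can differ when an algorithm is moved to another EHR is that PPV depends on how common true cases are among the patients being screened. The sketch below applies Bayes' rule to show this. The sensitivity and specificity values are the ones reported for the definite SLE algorithm, but the prevalence values are assumptions chosen for illustration, and the calculation assumes sensitivity and specificity stay constant across settings, which the study does not claim.

```python
def ppv(sensitivity, specificity, prevalence):
    # Bayes' rule: share of algorithm-positive patients who are true cases.
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed prevalences of true SLE among patients with at least one SLE code.
for prevalence in (0.10, 0.30, 0.50):
    value = ppv(sensitivity=0.64, specificity=0.97, prevalence=prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {value:.3f}")
```

With sensitivity and specificity held fixed, PPV rises as prevalence rises, so the same case definition can look considerably better or worse depending on the population it is applied to.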

Similar articles

Rule-based and machine learning algorithms identify patients with systemic sclerosis accurately in the electronic health record.

Arthritis Res Ther. 2019 Dec 30;21(1):305. doi: 10.1186/s13075-019-2092-7.

Developing Electronic Health Record Algorithms That Accurately Identify Patients With Systemic Lupus Erythematosus.

Arthritis Care Res (Hoboken). 2017 May;69(5):687-693. doi: 10.1002/acr.22989. Epub 2017 Apr 10.

Developing and Validating Methods to Assemble Systemic Lupus Erythematosus Births in the Electronic Health Record.

Arthritis Care Res (Hoboken). 2022 May;74(5):849-857. doi: 10.1002/acr.24522. Epub 2022 Mar 16.

Natural language processing to identify lupus nephritis phenotype in electronic health records.

BMC Med Inform Decis Mak. 2024 Mar 3;22(Suppl 2):348. doi: 10.1186/s12911-024-02420-7.

Automatic generation of case-detection algorithms to identify children with asthma from large electronic health record databases.

Pharmacoepidemiol Drug Saf. 2013 Aug;22(8):826-33. doi: 10.1002/pds.3438. Epub 2013 Apr 17.

Evaluation of structured data from electronic health records to identify clinical classification criteria attributes for systemic lupus erythematosus.

Lupus Sci Med. 2021 Apr;8(1). doi: 10.1136/lupus-2021-000488.

Automated feature selection of predictors in electronic medical records data.

Biometrics. 2019 Mar;75(1):268-277. doi: 10.1111/biom.12987. Epub 2019 Apr 2.

Classifying Pseudogout Using Machine Learning Approaches With Electronic Health Record Data.

Arthritis Care Res (Hoboken). 2021 Mar;73(3):442-448. doi: 10.1002/acr.24132.

Development and validation of lupus nephritis case definitions using United States veterans affairs electronic health records.

Lupus. 2021 Mar;30(3):518-526. doi: 10.1177/0961203320973267. Epub 2020 Nov 11.

Cited by

The Use of Machine Learning for Analyzing Real-World Data in Disease Prediction and Management: Systematic Review.

JMIR Med Inform. 2025 Jun 19;13:e68898. doi: 10.2196/68898.

A data-driven approach to discover and quantify systemic lupus erythematosus etiological heterogeneity from electronic health records.

AMIA Annu Symp Proc. 2025 May 22;2024:172-181. eCollection 2024.

Performance of the systemic lupus erythematosus risk probability index (SLERPI) in the Egyptian college of rheumatology (ECR) study cohort.

Clin Rheumatol. 2025 Jan;44(1):207-215. doi: 10.1007/s10067-024-07210-0. Epub 2024 Nov 4.

Validating claims-based algorithms for a systemic lupus erythematosus diagnosis in Medicare data for informed use of the Lupus Index: a tool for geospatial research.

Lupus Sci Med. 2024 Oct 14;11(2):e001329. doi: 10.1136/lupus-2024-001329.

Association Between Natural Hair Color, Race, and Alopecia.

Dermatol Ther (Heidelb). 2024 Aug;14(8):2109-2117. doi: 10.1007/s13555-024-01218-9. Epub 2024 Jul 2.

Patient Embeddings From Diagnosis Codes for Health Care Prediction Tasks: Pat2Vec Machine Learning Framework.

JMIR AI. 2023 Apr 21;2:e40755. doi: 10.2196/40755.

Hydroxychloroquine Dose and Hospitalizations for Active Lupus.

Arthritis Rheumatol. 2024 Oct;76(10):1512-1517. doi: 10.1002/art.42924. Epub 2024 Jun 21.

Comparison of late-onset and non-late-onset systemic lupus erythematosus individuals in a real-world electronic health record cohort.

Lupus. 2024 Apr;33(5):525-531. doi: 10.1177/09612033241238052. Epub 2024 Mar 7.

Systemic lupus in the era of machine learning medicine.

Lupus Sci Med. 2024 Mar 4;11(1):e001140. doi: 10.1136/lupus-2023-001140.

Autoimmune, Autoinflammatory Disease and Cutaneous Malignancy Associations with Hidradenitis Suppurativa: A Cross-Sectional Study.

Am J Clin Dermatol. 2024 May;25(3):473-484. doi: 10.1007/s40257-024-00844-5. Epub 2024 Feb 9.
