Identifying Diabetes Related-Complications in a Real-World Free-Text Electronic Medical Records in Hebrew Using Natural Language Processing Techniques.

Authors

Saban Mor, Lutski Miri, Zucker Inbar, Uziel Moshe, Ben-Moshe Dror, Israel Ariel, Vinker Shlomo, Golan-Cohen Avivit, Laufer Izhar, Green Ilan, Eldor Roy, Merzon Eugene

Affiliations

Nursing Department, School of Health Professions, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.

The Israel Center for Disease Control, Ministry of Health, Ramat Gan, Israel.

Publication information

J Diabetes Sci Technol. 2024 Jan 30:19322968241228555. doi: 10.1177/19322968241228555.

Abstract

BACKGROUND

Studies have demonstrated that 50% to 80% of patients do not receive an International Classification of Diseases (ICD) code assigned to their medical encounter or condition. For these patients, their clinical information is mostly recorded as unstructured free-text narrative data in the medical record without standardized coding or extraction of structured data elements. Leumit Health Services (LHS) in collaboration with the Israeli Ministry of Health (MoH) conducted this study using electronic medical records (EMRs) to systematically extract meaningful clinical information about people with diabetes from the unstructured free-text notes.

OBJECTIVES

To develop and validate natural language processing (NLP) algorithms to identify diabetes-related complications in the free-text medical records of patients who have LHS membership.

METHODS

The study data included 2.3 million records of 41 469 patients with diabetes aged 35 or older between the years 2012 and 2017. The diabetes-related complications included cardiovascular disease, diabetic neuropathy, nephropathy, retinopathy, diabetic foot, cognitive impairments, mood disorders, and hypoglycemia. A vocabulary list of terms was determined and adjudicated by two physicians experienced in diabetes care, each a board-certified diabetes specialist in endocrinology or family medicine. Two independent registered nurses with PhDs reviewed the free-text medical records. Both rule-based and machine learning techniques were used for the NLP algorithm development. Precision, recall, and F1-score were calculated to compare the performance of (1) the NLP algorithm with the reviewers' comments and (2) the ICD codes with the reviewers' comments for each complication.
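As an illustration of the evaluation step, the sketch below computes precision, recall, and F1-score for one complication by comparing a set of flags against the reviewers' labels. It is a minimal sketch, not the study's code: the function name `precision_recall_f1` and the toy label lists are hypothetical, and the numbers are made up purely to show the arithmetic.

```python
def precision_recall_f1(predicted, reference):
    """Return (precision, recall, F1) for two equal-length lists of booleans.

    `reference` plays the role of the reviewers' labels (gold standard);
    `predicted` is the output being evaluated (NLP flags or ICD codes).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)        # flagged and truly present
    fp = sum(1 for p, r in pairs if p and not r)    # flagged but absent
    fn = sum(1 for p, r in pairs if r and not p)    # present but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for six records and a single complication.
reviewer_labels = [True, True, True, False, False, True]
nlp_flags = [True, True, False, False, True, True]
icd_flags = [True, False, False, False, False, False]

for name, flags in [("NLP", nlp_flags), ("ICD", icd_flags)]:
    p, r, f1 = precision_recall_f1(flags, reviewer_labels)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a source that rarely flags a complication can score high precision yet a low F1, which is the pattern the toy ICD flags are meant to show.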

RESULTS

Compared with the reviewers (gold standard), the NLP algorithm achieved good overall performance, with a mean F1-score of 86%. This was better than the ICD codes, which achieved a mean F1-score of only 51%.

CONCLUSION

NLP algorithms and machine learning processes may enable more accurate identification of diabetes complications in EMR data.
