A Novel Sentence Transformer-based Natural Language Processing Approach for Schema Mapping of Electronic Health Records to the OMOP Common Data Model.

Authors

Zhou Xinyu, Dhingra Lovedeep Singh, Aminorroaya Arya, Adejumo Philip, Khera Rohan

Affiliations

Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.

Yale School of Medicine, New Haven, CT, USA.

Publication

AMIA Annu Symp Proc. 2025 May 22;2024:1332-1339. eCollection 2024.

Abstract

Mapping electronic health record (EHR) data to common data models (CDMs) standardizes clinical records, improving interoperability and enabling large-scale, multicenter clinical investigations. Using two large publicly available datasets, we developed transformer-based natural language processing models to map medication-related concepts from the EHR of a large and diverse healthcare system to standard concepts in the OMOP CDM. We validated the model outputs against standard concepts manually mapped by clinicians. Our best model reached out-of-the-box accuracies of 96.5% in mapping the 200 most common drugs and 83.0% in mapping 200 random drugs in the EHR. On both tasks, this model outperformed a state-of-the-art large language model (SFR-Embedding-Mistral; 89.5% and 66.5% accuracy, respectively), a widely used schema-mapping tool (Usagi; 90.0% and 70.0%), and direct string matching (7.5% on both tasks). Transformer-based deep learning models outperform existing approaches in the standardized mapping of EHR elements and can facilitate an end-to-end automated EHR transformation pipeline.
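To make the comparison in the abstract concrete, the sketch below contrasts the two ends of the spectrum it reports: a direct string match baseline and a nearest-neighbor mapping by cosine similarity, both scored by top-1 accuracy against a manually labeled reference. It is an illustration only, not the authors' implementation. The character-trigram vectorizer is a toy stand-in for a sentence transformer encoder so that the example runs with the standard library alone, and every function name, drug string, and label here is a hypothetical chosen for the example.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Toy stand-in for a sentence encoder: counts of character trigrams.

    ASSUMPTION: a real pipeline would use a transformer encoder here.
    """
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def map_by_similarity(source_name, standard_names, encode=trigram_vector):
    """Return the standard name whose vector is closest to the source name."""
    query = encode(source_name)
    return max(standard_names, key=lambda name: cosine(query, encode(name)))


def map_by_exact_match(source_name, standard_names):
    """Baseline: case-insensitive exact string match, or None if no match."""
    lookup = {name.lower(): name for name in standard_names}
    return lookup.get(source_name.lower())


def top1_accuracy(predictions, gold):
    """Fraction of predictions equal to the manually assigned label."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical standard vocabulary and manually labeled pairs (illustrative only).
STANDARD = [
    "atorvastatin 20 MG Oral Tablet",
    "metformin hydrochloride 500 MG Oral Tablet",
    "lisinopril 10 MG Oral Tablet",
]
LABELED = [
    ("ATORVASTATIN 20 MG TABLET", "atorvastatin 20 MG Oral Tablet"),
    ("metformin 500 mg tab", "metformin hydrochloride 500 MG Oral Tablet"),
    ("lisinopril 10 MG Oral Tablet", "lisinopril 10 MG Oral Tablet"),
]

sources = [source for source, _ in LABELED]
gold = [label for _, label in LABELED]
similarity_preds = [map_by_similarity(s, STANDARD) for s in sources]
exact_preds = [map_by_exact_match(s, STANDARD) for s in sources]

if __name__ == "__main__":
    print("similarity accuracy:", top1_accuracy(similarity_preds, gold))
    print("exact-match accuracy:", top1_accuracy(exact_preds, gold))
```

Exact matching fails whenever the EHR string abbreviates or reorders the standard name, which is consistent with the low string-match accuracy the abstract reports; similarity over vectors tolerates those surface differences. With 200 drugs per task, each reported accuracy corresponds to a whole number of correct mappings (for example, 96.5% of 200 is 193).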
