Developing a scalable FHIR-based clinical data normalization pipeline for standardizing and integrating unstructured and structured electronic health record data.

Authors

Hong Na, Wen Andrew, Shen Feichen, Sohn Sunghwan, Wang Chen, Liu Hongfang, Jiang Guoqian

Affiliation

Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA.

Publication

JAMIA Open. 2019 Oct 18;2(4):570-579. doi: 10.1093/jamiaopen/ooz056. eCollection 2019 Dec.

Abstract

OBJECTIVE

To design, develop, and evaluate a scalable clinical data normalization pipeline for standardizing unstructured electronic health record (EHR) data leveraging the HL7 Fast Healthcare Interoperability Resources (FHIR) specification.

METHODS

We established an FHIR-based clinical data normalization pipeline known as NLP2FHIR that mainly comprises: (1) a module for a core natural language processing (NLP) engine with an FHIR-based type system; (2) a module for integrating structured data; and (3) a module for content normalization. We evaluated the FHIR modeling capability focusing on core clinical resources such as Condition, Procedure, MedicationStatement (including Medication), and FamilyMemberHistory using Mayo Clinic's unstructured EHR data. We constructed a gold standard reusing annotation corpora from previous NLP projects.
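The three modules described above can be pictured with a minimal Python sketch. This is not the NLP2FHIR implementation: the toy dictionary lookup stands in for the NLP engine, and the names `Mention`, `NORMALIZATION`, `SPAN_EXTENSION_URL`, `extract_mentions`, `to_condition`, `normalize`, and `run_pipeline` are all hypothetical, as is the extension URL. Only the general shape of a FHIR Condition resource (`resourceType`, `subject`, `code`, `extension`) follows the FHIR specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Mention:
    """Hypothetical NLP output: a text span and its character offsets."""
    text: str
    begin: int
    end: int


# Hypothetical normalization table: lowercase surface form -> (system, code, display).
NORMALIZATION: Dict[str, Tuple[str, str, str]] = {
    "htn": ("http://snomed.info/sct", "38341003", "Hypertensive disorder"),
}

# Hypothetical URL for an NLP-specific extension recording the source span.
SPAN_EXTENSION_URL = "http://example.org/fhir/StructureDefinition/nlp-text-span"


def extract_mentions(note: str) -> List[Mention]:
    """Module 1 stand-in: dictionary lookup instead of a real NLP engine."""
    lowered = note.lower()
    mentions = []
    for term in NORMALIZATION:
        start = lowered.find(term)
        if start != -1:
            end = start + len(term)
            mentions.append(Mention(note[start:end], start, end))
    return mentions


def to_condition(mention: Mention, patient_id: str) -> dict:
    """Module 2 stand-in: attach a structured element (the patient reference)."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {"text": mention.text},
        "extension": [{
            "url": SPAN_EXTENSION_URL,
            "extension": [
                {"url": "begin", "valueInteger": mention.begin},
                {"url": "end", "valueInteger": mention.end},
            ],
        }],
    }


def normalize(resource: dict) -> dict:
    """Module 3 stand-in: add a standard coding when the surface form is known."""
    key = resource["code"]["text"].lower()
    if key in NORMALIZATION:
        system, code, display = NORMALIZATION[key]
        resource["code"]["coding"] = [
            {"system": system, "code": code, "display": display}
        ]
    return resource


def run_pipeline(note: str, patient_id: str) -> List[dict]:
    """Chain the three stand-in modules over one clinical note."""
    return [normalize(to_condition(m, patient_id)) for m in extract_mentions(note)]
```

Calling `run_pipeline("Patient has HTN.", "123")` would yield one Condition-shaped dictionary whose `code` carries both the original text and a SNOMED CT coding, with the source offsets kept in the extension.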

RESULTS

A total of 30 mapping rules, 62 normalization rules, and 11 NLP-specific FHIR extensions were created and implemented in the NLP2FHIR pipeline. For each clinical resource, the elements that need to be populated from structured data were identified. The performance of unstructured data modeling achieved scores ranging from 0.69 to 0.99 for various FHIR element representations (0.69-0.99 for Condition; 0.75-0.84 for Procedure; 0.71-0.99 for MedicationStatement; and 0.75-0.95 for FamilyMemberHistory).

CONCLUSION

We demonstrated that the NLP2FHIR pipeline is feasible for modeling unstructured EHR data and integrating structured elements into the model. The outcomes of this work provide standards-based tools for clinical data normalization, which is indispensable for enabling portable EHR-driven phenotyping and large-scale data analytics, and offer useful insights for future development of the FHIR specification with regard to handling unstructured clinical data.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3706/6993992/36869db138b1/ooz056f1.jpg
