Secondary use of electronic health records for building cohort studies through top-down information extraction.

Author information

Kreuzthaler Markus, Schulz Stefan, Berghold Andrea

Affiliation

Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.

Publication information

J Biomed Inform. 2015 Feb;53:188-95. doi: 10.1016/j.jbi.2014.10.010. Epub 2014 Nov 21.

DOI:10.1016/j.jbi.2014.10.010

PMID:25451102

Abstract

Controlled clinical trials are usually supported with an in-front data aggregation system, which supports the storage of relevant information according to the trial context within a highly structured environment. In contrast to the documentation of clinical trials, daily routine documentation has many characteristics that influence data quality. One such characteristic is the use of non-standardized text, which is an indispensable part of information representation in clinical information systems. Based on a cohort study we highlight challenges for mining electronic health records targeting free text entry fields within semi-structured data sources. Our prototypical information extraction system achieved an F-measure of 0.91 (precision=0.90, recall=0.93) for the training set and an F-measure of 0.90 (precision=0.89, recall=0.92) for the test set. We analyze the obtained results in detail and highlight challenges and future directions for the secondary use of routine data in general.
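
The reported F-measures can be checked against the reported precision and recall. The following is a minimal sketch, assuming the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    # Guard against division by zero when both inputs are zero.
    if precision <= 0.0 and recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall values as reported in the abstract.
reported = {"training": (0.90, 0.93), "test": (0.89, 0.92)}

for name, (p, r) in reported.items():
    print(f"{name}: F = {f_measure(p, r):.2f}")
# prints "training: F = 0.91" and "test: F = 0.90"
```

Under this assumption, the recomputed values agree with the figures given in the abstract to two decimal places.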

Similar articles

A method for cohort selection of cardiovascular disease records from an electronic health record system.

Int J Med Inform. 2017 Jun;102:138-149. doi: 10.1016/j.ijmedinf.2017.03.015. Epub 2017 Mar 30.

Roogle: an information retrieval engine for clinical data warehouse.

Stud Health Technol Inform. 2011;169:584-8.

Extracting important information from Chinese Operation Notes with natural language processing methods.

J Biomed Inform. 2014 Apr;48:130-6. doi: 10.1016/j.jbi.2013.12.017. Epub 2014 Jan 31.

Ontology-based clinical information extraction from physician's free-text notes.

J Biomed Inform. 2019 Oct;98:103276. doi: 10.1016/j.jbi.2019.103276. Epub 2019 Aug 29.

An architecture for biological information extraction and representation.

Bioinformatics. 2005 Feb 15;21(4):430-8. doi: 10.1093/bioinformatics/bti187. Epub 2004 Dec 17.

[Information extraction methodology used in electronic medical records].

Zhongguo Yi Liao Qi Xie Za Zhi. 2011 Jan;35(1):39-41.

Lexical patterns, features and knowledge resources for coreference resolution in clinical notes.

J Biomed Inform. 2012 Oct;45(5):901-12. doi: 10.1016/j.jbi.2012.02.012. Epub 2012 Mar 17.

Building a common pipeline for rule-based document classification.

Stud Health Technol Inform. 2013;192:1211.

Rule-based information extraction from patients' clinical data.

J Biomed Inform. 2009 Oct;42(5):923-36. doi: 10.1016/j.jbi.2009.07.007. Epub 2009 Jul 29.

Cited by

Electronic health record data quality assessment and tools: a systematic review.

J Am Med Inform Assoc. 2023 Sep 25;30(10):1730-1740. doi: 10.1093/jamia/ocad120.

Catch Me if You Can: Acute Events Hidden in Structured Chronic Disease Diagnosis Descriptions Show Detectable Recording Patterns in EHR.

AMIA Annu Symp Proc. 2021 Jan 25;2020:373-382. eCollection 2020.

Combining structured and unstructured data in EMRs to create clinically-defined EMR-derived cohorts.

BMC Med Inform Decis Mak. 2021 Mar 8;21(1):91. doi: 10.1186/s12911-021-01441-w.

Cohort Selection for Clinical Trials From Longitudinal Patient Records: Text Mining Approach.

JMIR Med Inform. 2019 Oct 31;7(4):e15980. doi: 10.2196/15980.

Improving a Secondary Use Health Data Warehouse: Proposing a Multi-Level Data Quality Framework.

EGEMS (Wash DC). 2019 Aug 2;7(1):38. doi: 10.5334/egems.298.

Clinical Natural Language Processing in languages other than English: opportunities and challenges.

J Biomed Semantics. 2018 Mar 30;9(1):12. doi: 10.1186/s13326-018-0179-8.

Update on Data Reuse in Health Care.

Yearb Med Inform. 2017 Aug;26(1):24-27. doi: 10.15265/IY-2017-013. Epub 2017 Sep 11.

Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

J Biomed Inform. 2017 Sep;73:14-29. doi: 10.1016/j.jbi.2017.07.012. Epub 2017 Jul 17.

Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress.

Yearb Med Inform. 2017 Aug;26(1):38-52. doi: 10.15265/IY-2017-007. Epub 2017 Sep 11.

Aspiring to Unintended Consequences of Natural Language Processing: A Review of Recent Developments in Clinical and Consumer-Generated Text Processing.

Yearb Med Inform. 2016 Nov 10(1):224-233. doi: 10.15265/IY-2016-017.
