FABLE: A Semi-Supervised Prescription Information Extraction System.

Authors

Tao Carson, Filannino Michele, Uzuner Özlem

Affiliations

SUNY at Albany, Albany, NY, USA.

George Mason University, Fairfax, Virginia, USA.

Publication information

AMIA Annu Symp Proc. 2018 Dec 5;2018:1534-1543. eCollection 2018.

Abstract

Prescription information is an important component of electronic health records (EHRs). It contains detailed medication instructions that are crucial for patients' well-being and is often recorded in the narrative portions of EHRs. As a result, EHR narratives need to be processed with natural language processing (NLP) methods that can extract medication and prescription information from free text. However, automatic methods for medication and prescription extraction from narratives face two major challenges: (1) dictionaries can fall short even when identifying well-defined and syntactically consistent categories of medication entities, and (2) some categories of medication entities are sparse and, at the same time, lexically (and syntactically) diverse. In this paper, we describe FABLE, a system for automatically extracting prescription information from discharge summaries. FABLE uses unannotated data to augment annotated training data: it performs semi-supervised extraction of medication information using pseudo-labels with Conditional Random Fields (CRFs) to improve its handling of incomplete, sparse, and diverse medication entities. When evaluated against the official benchmark set from the 2009 i2b2 Shared Task and Workshop on Medication Extraction, FABLE achieves a horizontal phrase-level F1-measure of 0.878, giving state-of-the-art performance and significantly improving on nearly all entity categories.
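The pseudo-labeling idea mentioned in the abstract can be illustrated with a minimal self-training sketch. This is not FABLE's implementation: the abstract does not give the features, confidence criterion, or number of rounds, so everything below is an assumption made for illustration. In particular, a toy feature-count tagger stands in for the CRF, and the names `features`, `train`, `predict`, `self_train`, the `threshold` value, and the sample sentences are all hypothetical.

```python
from collections import Counter, defaultdict

def features(tokens, i):
    """Two hypothetical features per token: the word itself and its left neighbour."""
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return [("word", tokens[i].lower()), ("prev", prev)]

def train(sentences):
    """Toy stand-in for CRF training: count how often each feature co-occurs with each label."""
    model = defaultdict(Counter)
    for tokens, labels in sentences:
        for i, label in enumerate(labels):
            for feat in features(tokens, i):
                model[feat][label] += 1
    return model

def predict(model, tokens):
    """Return (labels, confidence); confidence is the weakest token's share of votes."""
    labels, confidence = [], 1.0
    for i in range(len(tokens)):
        votes = Counter()
        for feat in features(tokens, i):
            votes.update(model.get(feat, {}))
        if not votes:
            # No known feature fires: fall back to the outside label with zero confidence.
            labels.append("O")
            confidence = 0.0
            continue
        label, top = votes.most_common(1)[0]
        labels.append(label)
        confidence = min(confidence, top / sum(votes.values()))
    return labels, confidence

def self_train(labeled, unlabeled, threshold=0.9, rounds=3):
    """Pseudo-labeling loop: add confidently tagged unlabeled sentences to the training set."""
    training = list(labeled)
    pool = list(unlabeled)
    model = train(training)
    for _ in range(rounds):
        confident, remaining = [], []
        for tokens in pool:
            labels, conf = predict(model, tokens)
            (confident if conf >= threshold else remaining).append((tokens, labels))
        if not confident:
            break  # nothing new to learn from
        training.extend(confident)  # pseudo-labels are treated as if they were annotations
        pool = [tokens for tokens, _ in remaining]
        model = train(training)
    return model

# Hypothetical sample data, not taken from the i2b2 corpus.
labeled = [(["take", "aspirin", "daily"], ["O", "DRUG", "FREQ"])]
unlabeled = [["take", "ibuprofen", "daily"]]

baseline_labels, _ = predict(train(labeled), ["give", "ibuprofen"])
augmented_labels, _ = predict(self_train(labeled, unlabeled), ["give", "ibuprofen"])
print(baseline_labels, augmented_labels)
```

In this toy setting the baseline tagger has never seen "ibuprofen" and leaves it untagged, while the self-trained tagger picks it up from the pseudo-labeled sentence. A real system would replace `train` and `predict` with a CRF and would need a more careful confidence estimate, since confidently wrong pseudo-labels reinforce their own errors.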
