

Current approaches to identify sections within clinical narratives from electronic health records: a systematic review.

Affiliations

Pontificia Universidad Javeriana, Cra. 7 No 40-62, Bogotá, 110231, Colombia.

Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Auenbruggerplatz 2, Graz, 8036, Austria.

Publication information

BMC Med Res Methodol. 2019 Jul 18;19(1):155. doi: 10.1186/s12874-019-0792-y.

Abstract

BACKGROUND

The identification of sections in the narrative content of Electronic Health Records (EHR) has been shown to improve the performance of clinical extraction tasks; however, there is not yet a shared understanding of the concept and its existing methods. The objective is to report the results of a systematic review of approaches that identify sections in the narrative content of EHR using either automatic or semi-automatic methods.

METHODS

This review includes articles from three databases: SCOPUS, Web of Science and PubMed (from January 1994 to September 2018). Studies were selected using predefined eligibility criteria and following the PRISMA recommendations. Search criteria were developed through iterative and collaborative keyword enrichment.

RESULTS

Following the eligibility criteria, 39 studies were selected for analysis. The section identification approaches proposed by these studies vary greatly depending on the kind of narrative, the type of section, and the application. We observed that 57% of them proposed formal methods for identifying sections and 43% adapted a previously created method. Seventy-eight percent were intended for English texts and 41% for discharge summaries. Forty-six percent of the studies are able to identify both explicit (with headings) and implicit sections. Regarding the level of granularity, 54% of the studies are able to identify sections, but not subsections. From a technical point of view, the methods can be classified into rule-based methods (59%), machine learning methods (22%) and a combination of both (19%). Hybrid methods showed better results than pure machine learning approaches, but lower results than rule-based methods; however, their scope was more ambitious than that of the rule-based methods. Despite the promising performance results, very few studies reported tests under a formal setup. Almost all the studies relied on custom dictionaries; however, they used them in conjunction with a controlled terminology, most commonly the UMLS® Metathesaurus.
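To make the rule-based, dictionary-driven category concrete, the following is a minimal Python sketch of explicit section identification. It is not taken from any of the reviewed studies: the dictionary entries, the heading pattern, and the function name `split_sections` are illustrative assumptions, and the sketch handles only explicit sections (lines that open with a known heading followed by a colon). Implicit sections and subsections are out of its scope.

```python
import re

# Hypothetical custom dictionary: heading variants -> normalized section label.
# A real system would typically pair this with a controlled terminology.
SECTION_DICTIONARY = {
    "history of present illness": "history_present_illness",
    "hpi": "history_present_illness",
    "medications": "medications",
    "discharge medications": "medications",
    "allergies": "allergies",
    "assessment and plan": "assessment_plan",
}

# Assumption: an explicit heading is a short phrase at the start of a line,
# followed by a colon; anything after the colon is body text.
HEADING_PATTERN = re.compile(r"^\s*([A-Za-z][A-Za-z /&-]{1,40}?)\s*:\s*(.*)$")


def split_sections(note):
    """Split a clinical note into (label, text) pairs using known headings."""
    sections = []
    label, lines = "unlabeled", []

    def flush():
        # Store the section collected so far, skipping empty ones.
        text = " ".join(part for part in lines if part)
        if text:
            sections.append((label, text))

    for line in note.splitlines():
        match = HEADING_PATTERN.match(line)
        key = match.group(1).strip().lower() if match else None
        if key in SECTION_DICTIONARY:
            # A known heading closes the previous section and opens a new one.
            flush()
            label = SECTION_DICTIONARY[key]
            lines = [match.group(2).strip()]
        else:
            # Unknown headings and plain lines stay in the current section.
            lines.append(line.strip())
    flush()
    return sections


if __name__ == "__main__":
    example = (
        "HPI: Patient presents with chest pain.\n"
        "Pain started two days ago.\n"
        "Medications: aspirin 81 mg daily\n"
        "Allergies: none known"
    )
    for section_label, section_text in split_sections(example):
        print(section_label, "->", section_text)
```

Lines whose candidate heading is not in the dictionary are kept as body text of the current section, which is why approaches of this kind depend heavily on the coverage of their dictionaries.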

CONCLUSIONS

Identification of sections in EHR narratives is gaining popularity as a way to improve clinical extraction projects. This study gives the community working on clinical NLP a formal analysis of this task, including the most successful ways to perform it.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ed5/6637496/7837bf7ea7f0/12874_2019_792_Fig1_HTML.jpg
