Clinical concept recognition: Evaluation of existing systems on EHRs.

Author Information

Lossio-Ventura Juan Antonio, Sun Ran, Boussard Sebastien, Hernandez-Boussard Tina

Affiliations

Biomedical Informatics Research, Stanford University, Stanford, CA, United States.

National Institute of Mental Health, National Institutes of Health, Bethesda, MD, United States.

Publication Information

Front Artif Intell. 2023 Jan 13;5:1051724. doi: 10.3389/frai.2022.1051724. eCollection 2022.

Abstract

OBJECTIVE

The adoption of electronic health records (EHRs) has produced enormous amounts of data, creating research opportunities in clinical data science. Several concept recognition systems have been developed to facilitate clinical information extraction from these data. While studies comparing the performance of many concept recognition systems exist, such evaluations are typically conducted internally and may be biased by differences in implementation, parameter settings, and the limited number of systems included. The goal of this research is to evaluate the performance of existing systems in retrieving relevant clinical concepts from EHRs.

METHODS

We investigated six concept recognition systems: CLAMP, cTAKES, MetaMap, NCBO Annotator, QuickUMLS, and ScispaCy. The clinical concepts extracted included procedures, disorders, medications, and anatomical locations. System performance was evaluated on two datasets: the 2010 i2b2 challenge dataset and MIMIC-III. Additionally, we assessed the performance of these systems in five challenging situations: negation, severity, abbreviation, ambiguity, and misspelling.
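
To make the evaluation setup concrete, the short Python sketch below shows how one of the evaluated systems, ScispaCy, can be run with its UMLS entity linker over a clinical note. This is a minimal illustration, not the authors' pipeline: the model name (en_core_sci_sm), the pipeline configuration, and the example note are assumptions chosen for brevity.

    import spacy
    from scispacy.abbreviation import AbbreviationDetector  # registers the "abbreviation_detector" pipe
    from scispacy.linking import EntityLinker               # registers the "scispacy_linker" pipe

    # Small biomedical spaCy model; larger ScispaCy models are also available.
    nlp = spacy.load("en_core_sci_sm")
    nlp.add_pipe("abbreviation_detector")
    nlp.add_pipe(
        "scispacy_linker",
        config={"resolve_abbreviations": True, "linker_name": "umls"},
    )

    # Illustrative note containing an abbreviation (HTN), a severity term, and a negation.
    note = "Pt denies chest pain. Severe HTN noted; continue lisinopril 10 mg daily."
    doc = nlp(note)

    linker = nlp.get_pipe("scispacy_linker")
    for ent in doc.ents:
        # ent._.kb_ents holds (CUI, score) candidates proposed by the UMLS linker.
        for cui, score in ent._.kb_ents[:1]:  # keep only the top-ranked candidate
            print(ent.text, cui, round(score, 2), linker.kb.cui_to_entity[cui].canonical_name)

The other systems in the study are invoked through their own APIs or command-line interfaces and are not shown here.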

RESULTS

For clinical concept extraction, CLAMP achieved the best performance on both exact and inexact matching, with F-scores of 0.70 and 0.94, respectively, on i2b2, and 0.39 and 0.50 on MIMIC-III. Across the five challenging situations, ScispaCy excelled at extracting abbreviation information (F-score: 0.86), followed by NCBO Annotator (F-score: 0.79). CLAMP performed best at extracting severity terms (F-score: 0.73), followed by NCBO Annotator (F-score: 0.68). CLAMP also outperformed the other systems at extracting negated concepts (F-score: 0.63).
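
The exact-match scores reported above follow the standard precision, recall, and F-score definitions over predicted versus gold-standard concept spans, with F = 2PR / (P + R). The sketch below shows one straightforward way such exact-match scores can be computed; the span representation and toy data are illustrative and do not reproduce the paper's evaluation script.

    def prf(predicted: set, gold: set):
        """Exact-match precision, recall, and F-score.
        Spans are hashable tuples such as (doc_id, start, end, concept_type)."""
        tp = len(predicted & gold)                     # spans present in both sets
        p = tp / len(predicted) if predicted else 0.0  # precision
        r = tp / len(gold) if gold else 0.0            # recall
        f = 2 * p * r / (p + r) if (p + r) else 0.0    # harmonic mean of p and r
        return p, r, f

    # Toy data: 2 of 3 predicted spans are correct, 2 of 4 gold spans are recovered.
    pred = {("note1", 0, 10, "problem"), ("note1", 15, 24, "drug"), ("note1", 30, 36, "test")}
    gold = {("note1", 0, 10, "problem"), ("note1", 15, 24, "drug"),
            ("note1", 40, 48, "problem"), ("note1", 50, 57, "test")}
    print(prf(pred, gold))  # roughly (0.667, 0.5, 0.571)

Inexact (partial) matching relaxes the span comparison to count overlapping spans as hits, which is why the inexact F-scores are higher than the exact ones.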

CONCLUSIONS

Several concept recognition systems exist to extract clinical information from unstructured data. This study provides an external evaluation by end-users of six commonly used systems across different extraction tasks. Our findings suggest that CLAMP provides the most comprehensive set of annotations for clinical concept extraction tasks and associated challenges. Comparing standard extraction tasks across systems provides guidance to other clinical researchers when selecting a concept recognition system relevant to their clinical information extraction task.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0bb/9880223/d687c4c65fe7/frai-05-1051724-g0001.jpg
