Performance of Natural Language Processing for Information Extraction From Electronic Health Records Within Cancer: Systematic Review

Authors

Dahl Simon, Bøgsted Martin, Sagi Tomer, Vesteghem Charles

Affiliations

Center for Clinical Data Science, Department of Clinical Medicine, Aalborg University, Selma Lagerløfs Vej 249, Gistrup, 9260, Denmark, +45 99407244.

Center for Clinical Data Science, Research, Education and Innovation, Aalborg University Hospital, Aalborg, Denmark.

Publication information

JMIR Med Inform. 2025 Sep 12;13:e68707. doi: 10.2196/68707.

DOI:10.2196/68707

PMID:40939201

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12431712/

Abstract

BACKGROUND

Over the last decade, natural language processing (NLP) has provided various solutions for information extraction (IE) from textual clinical data. In recent years, the use of NLP in cancer research has gained considerable attention, with numerous studies exploring the effectiveness of various NLP techniques for identifying and extracting cancer-related entities from clinical text data.

OBJECTIVE

We aimed to summarize the performance differences between various NLP models for IE within the context of cancer to provide an overview of the relative performance of existing models.

METHODS

This systematic literature review searched 3 databases (PubMed, Scopus, and Web of Science) for articles that extracted cancer-related entities from clinical texts. In total, 33 articles were eligible for inclusion. From each article, we extracted the NLP models and their performance, measured as F1-scores. Each model was assigned to one of the following categories: rule-based, traditional machine learning, conditional random field-based, neural network, and bidirectional transformer (BT). For each pair of categories, the average performance difference was calculated across all articles.
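
The pairwise comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how multiple models of one category within an article were handled, so the sketch assumes the best F1-score per category per article is used, and that a pair of categories is compared only within articles that report both. The records, article IDs, and the function name `average_pairwise_differences` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records (article_id, model_category, F1-score).
# These values are illustrative and are NOT data from the review.
records = [
    ("A1", "rule-based", 0.70),
    ("A1", "BT", 0.90),
    ("A2", "CRF", 0.80),
    ("A2", "BT", 0.85),
    ("A2", "rule-based", 0.60),
]


def average_pairwise_differences(records):
    """Average within-article F1 difference for each pair of categories.

    Assumption: each article contributes its best F1-score per category,
    and a pair is compared only in articles that report both categories.
    """
    # Best F1-score per category within each article.
    best = defaultdict(dict)
    for article, category, f1 in records:
        previous = best[article].get(category)
        if previous is None or f1 > previous:
            best[article][category] = f1

    # Collect within-article differences for every category pair.
    differences = defaultdict(list)
    for scores in best.values():
        for first, second in combinations(sorted(scores), 2):
            differences[(first, second)].append(scores[first] - scores[second])

    # Average each pair's differences across the articles that report it.
    return {pair: sum(values) / len(values) for pair, values in differences.items()}


if __name__ == "__main__":
    for (first, second), mean_diff in average_pairwise_differences(records).items():
        print(f"{first} minus {second}: {mean_diff:+.4f}")
```

A positive value for a pair means the first-named category scored higher on average in the articles that evaluated both. Differences are computed within articles because raw F1-scores from different datasets and extraction tasks are not directly comparable.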

RESULTS

The articles covered various scenarios, with the best F1-score reported in each article ranging from 0.355 to 0.985. In the overall relative comparison, the BT category outperformed every other category, with average F1-score differences ranging from 0.0439 to 0.2335. The percentage of articles implementing BTs has increased over the years.

CONCLUSIONS

NLP has demonstrated the ability to identify and extract cancer-related entities from unstructured textual data. Generally, more advanced models outperform less advanced ones. The BT category performed the best.
