NLP for Analyzing Electronic Health Records and Clinical Notes in Cancer Research: A Review.

Authors

Bilal Muhammad, Hamza Ameer, Malik Nadia

Affiliations

Department of Pharmaceutical Outcomes and Policy (M.B.), University of Florida, Gainesville, Florida, USA; Department of Software Engineering (M.B.), National University of Computer and Emerging Sciences, Islamabad, Pakistan.

Department of Computer Science (A.H.), Faculty of Computing and IT, University of Sargodha, Sargodha, Punjab, Pakistan.

Publication information

J Pain Symptom Manage. 2025 May;69(5):e374-e394. doi: 10.1016/j.jpainsymman.2025.01.019. Epub 2025 Jan 31.

DOI:10.1016/j.jpainsymman.2025.01.019

PMID:39894080

Abstract

This review examines the application of natural language processing (NLP) techniques in cancer research using electronic health records (EHRs) and clinical notes. It addresses gaps in existing literature by providing a broader perspective than previous studies focused on specific cancer types or applications. A comprehensive literature search in the Scopus database identified 94 relevant studies published between 2019 and 2024. The analysis revealed a growing trend in NLP applications for cancer research, with information extraction (47 studies) and text classification (40 studies) emerging as predominant NLP tasks, followed by named entity recognition (7 studies). Among cancer types, breast, lung, and colorectal cancers were found to be the most studied. A significant shift from rule-based and traditional machine learning approaches to advanced deep learning techniques and transformer-based models was observed. It was found that dataset sizes used in existing studies varied widely, ranging from small, manually annotated datasets to large-scale EHRs. The review highlighted key challenges, including the limited generalizability of proposed solutions and the need for improved integration into clinical workflows. While NLP techniques show significant potential in analyzing EHRs and clinical notes for cancer research, future work should focus on improving model generalizability, enhancing robustness in handling complex clinical language, and expanding applications to understudied cancer types. The integration of NLP tools into palliative medicine and addressing ethical considerations remain crucial for utilizing the full potential of NLP in enhancing cancer diagnosis, treatment, and patient outcomes. This review provides valuable insights into the current state and future directions of NLP applications in cancer research.

Cited by

Performance of Natural Language Processing for Information Extraction From Electronic Health Records Within Cancer: Systematic Review.

JMIR Med Inform. 2025 Sep 12;13:e68707. doi: 10.2196/68707.
