Natural language processing in medical text processing: A scoping literature review.

Author information

Elvas Luis B, Almeida Ana, Ferreira João C

Affiliations

Department of Logistics, Molde University College, Molde 6410, Norway; Inov Inesc Inovação - Instituto de Novas Tecnologias, 1000-029 Lisbon, Portugal; Breast Cancer Research Program, Champalimaud Foundation, Lisbon, Portugal; ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026 Lisbon, Portugal.

ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026 Lisbon, Portugal.

Publication information

Int J Med Inform. 2025 Dec;204:106049. doi: 10.1016/j.ijmedinf.2025.106049. Epub 2025 Jul 17.

Abstract

BACKGROUND

The exponential growth of digitized medical data has created significant challenges for healthcare professionals, as medical documentation transitions from simple text records to complex, multi-dimensional data structures. Natural Language Processing (NLP), particularly Named Entity Recognition (NER), has emerged as a crucial tool for extracting and categorizing critical information from clinical texts. The development of transformer-based models like BERT and the ability to fine-tune pre-trained AI models have revolutionized the field, offering unprecedented opportunities to enhance the efficient and precise interpretation of medical data across diverse languages and healthcare contexts.

OBJECTIVE

This literature review aimed to analyze recent NLP approaches for medical text processing, examining techniques, performance metrics, and advancements across different languages and healthcare contexts.

METHOD

Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology, a scoping search was conducted in the Scopus and PubMed databases, focusing on studies published between 2019 and 2024. The review included studies on language model fine-tuning and information extraction in healthcare, with a specific search query designed to capture relevant NLP techniques.

RESULTS

Of 67 initial records, 31 studies were ultimately included. Bidirectional Encoder Representations from Transformers (BERT)-based approaches, neural networks, and CRF/LSTM techniques dominated, consistently achieving F1-scores above 85%. The studies covered multiple languages, with 51.5% in English, 27.3% in Chinese, and smaller representations in Italian, German, and Spanish. Hybrid approaches and techniques addressing data privacy and limited labeled data were notably prevalent.
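
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1-score for NER could be computed. It assumes exact-match, entity-level scoring, which is one common convention; the abstract does not state which variant (token-level or entity-level, micro or macro) the reviewed studies used. The function name `entity_f1`, the span representation, and the sample annotations are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# A span is (start token index, end token index, entity label).
# This representation is an assumption made for this sketch.
Span = Tuple[int, int, str]


def entity_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match span scoring."""
    # A prediction counts only if boundaries and label both match a gold span.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations, invented for illustration only.
gold = {(0, 1, "DRUG"), (4, 6, "DISEASE"), (9, 9, "DOSAGE"), (12, 13, "SYMPTOM")}
predicted = {(0, 1, "DRUG"), (4, 6, "DISEASE"), (9, 9, "DOSAGE"), (12, 12, "SYMPTOM")}

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this invented example, three of four predicted spans match the gold annotations exactly, so precision, recall, and F1 are all 0.75; the fourth prediction has the right label but the wrong end boundary and therefore counts as an error under exact matching.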

CONCLUSIONS

The review revealed that modern NLP techniques, particularly BERT-based models and hybrid approaches, show significant promise in medical text processing across different languages. While challenges remain in cross-lingual adaptation and data availability, these technologies demonstrate potential to enhance medical data interpretation and analysis.
