A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records.

Affiliations

School of Information, University of Arizona, Tucson, USA.

Department of Biostatistics and Health Informatics, King's College London, London, United Kingdom.

Publication information

Yearb Med Inform. 2021 Aug;30(1):239-244. doi: 10.1055/s-0041-1726522. Epub 2021 Sep 3.

DOI: 10.1055/s-0041-1726522
PMID: 34479396
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8416218/
Abstract

OBJECTIVES

We survey recent work in biomedical NLP on building more adaptable or generalizable models, with a focus on work dealing with electronic health record (EHR) texts, to better understand recent trends in this area and identify opportunities for future research.

METHODS

We searched PubMed, the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computational Linguistics (ACL) anthology, the Association for the Advancement of Artificial Intelligence (AAAI) proceedings, and Google Scholar for the years 2018-2020. We reviewed abstracts to identify the most relevant and impactful work, and manually extracted data points from each of these papers to characterize the types of methods and tasks that were studied, in which clinical domains, and current state-of-the-art results.

RESULTS

The ubiquity of pre-trained transformers in clinical NLP research has contributed to an increase in domain adaptation and generalization-focused work that uses these models as the key component. Most recently, work has started to train biomedical transformers and to extend the fine-tuning process with additional domain adaptation techniques. We also highlight recent research in cross-lingual adaptation, as a special case of adaptation.

CONCLUSIONS

While pre-trained transformer models have led to some large performance improvements, general domain pre-training does not always transfer adequately to the clinical domain due to its highly specialized language. There is also much work to be done in showing that the gains obtained by pre-trained transformers are beneficial in real world use cases. The amount of work in domain adaptation and transfer learning is limited by dataset availability and creating datasets for new domains is challenging. The growing body of research in languages other than English is encouraging, and more collaboration between researchers across the language divide would likely accelerate progress in non-English clinical NLP.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/8416218/8c04a023a0c9/10-1055-s-0041-1726522-ilaparra-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/8416218/c0d08668c0da/10-1055-s-0041-1726522-ilaparra-1.jpg

Similar articles

1
A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records.
Yearb Med Inform. 2021 Aug;30(1):239-244. doi: 10.1055/s-0041-1726522. Epub 2021 Sep 3.
2
A comparison of word embeddings for the biomedical natural language processing.
J Biomed Inform. 2018 Nov;87:12-20. doi: 10.1016/j.jbi.2018.09.008. Epub 2018 Sep 12.
3
The Growing Impact of Natural Language Processing in Healthcare and Public Health.
Inquiry. 2024 Jan-Dec;61:469580241290095. doi: 10.1177/00469580241290095.
4
Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review.
JMIR Med Inform. 2023 Dec 15;11:e42477. doi: 10.2196/42477.
5
Does BERT need domain adaptation for clinical negation detection?
J Am Med Inform Assoc. 2020 Apr 1;27(4):584-591. doi: 10.1093/jamia/ocaa001.
6
Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT.
BMC Bioinformatics. 2022 Apr 21;23(1):144. doi: 10.1186/s12859-022-04688-w.
7
Expanding the Diversity of Texts and Applications: Findings from the Section on Clinical Natural Language Processing of the International Medical Informatics Association Yearbook.
Yearb Med Inform. 2018 Aug;27(1):193-198. doi: 10.1055/s-0038-1667080. Epub 2018 Aug 29.
8
Transformers-sklearn: a toolkit for medical language understanding with transformer-based models.
BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):90. doi: 10.1186/s12911-021-01459-0.
9
Exploring the Latest Highlights in Medical Natural Language Processing across Multiple Languages: A Survey.
Yearb Med Inform. 2023 Aug;32(1):230-243. doi: 10.1055/s-0043-1768726. Epub 2023 Dec 26.
10
Relation Extraction from Clinical Narratives Using Pre-trained Language Models.
AMIA Annu Symp Proc. 2020 Mar 4;2019:1236-1245. eCollection 2019.

Cited by

1
Identifying Adverse Drug Events in Clinical Text Using Fine-Tuned Clinical Language Models: Machine Learning Study.
JMIR Form Res. 2025 Sep 11;9:e71949. doi: 10.2196/71949.
2
Transfer Learning with Clinical Concept Embeddings from Large Language Models.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:167-176. eCollection 2025.
3
Transfer learning assessment of small datasets relating manufacturing parameters with electrochemical energy cell component properties.
NPJ Adv Manuf. 2025;2(1):14. doi: 10.1038/s44334-025-00024-1. Epub 2025 Apr 18.
4
The Application of Deep Learning Tools on Medical Reports to Optimize the Input of an Atrial-Fibrillation-Recurrence Predictive Model.
J Clin Med. 2025 Mar 27;14(7):2297. doi: 10.3390/jcm14072297.
5
Utilizing large language models for gastroenterology research: a conceptual framework.
Therap Adv Gastroenterol. 2025 Apr 1;18:17562848251328577. doi: 10.1177/17562848251328577. eCollection 2025.
6
Automated System to Capture Patient Symptoms From Multitype Japanese Clinical Texts: Retrospective Study.
JMIR Med Inform. 2024 Sep 24;12:e58977. doi: 10.2196/58977.
7
Machine learning natural language processing for identifying venous thromboembolism: systematic review and meta-analysis.
Blood Adv. 2024 Jun 25;8(12):2991-3000. doi: 10.1182/bloodadvances.2023012200.
8
Exploring the Latest Highlights in Medical Natural Language Processing across Multiple Languages: A Survey.
Yearb Med Inform. 2023 Aug;32(1):230-243. doi: 10.1055/s-0043-1768726. Epub 2023 Dec 26.
9
Association of metastatic pattern in breast cancer with tumor and patient-specific factors: a nationwide autopsy study using artificial intelligence.
Breast Cancer. 2024 Mar;31(2):263-271. doi: 10.1007/s12282-023-01534-6. Epub 2023 Dec 22.
10
Large Language Models in Neurology Research and Future Practice.
Neurology. 2023 Dec 4;101(23):1058-1067. doi: 10.1212/WNL.0000000000207967.

References

1
Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction.
NPJ Digit Med. 2021 May 20;4(1):86. doi: 10.1038/s41746-021-00455-y.
2
On the Construction of Multilingual Corpora for Clinical Text Mining.
Stud Health Technol Inform. 2020 Jun 16;270:347-351. doi: 10.3233/SHTI200180.
3
BEHRT: Transformer for Electronic Health Records.
Sci Rep. 2020 Apr 28;10(1):7155. doi: 10.1038/s41598-020-62922-y.
4
Research on Chinese medical named entity recognition based on collaborative cooperation of multiple neural network models.
J Biomed Inform. 2020 Apr;104:103395. doi: 10.1016/j.jbi.2020.103395. Epub 2020 Feb 25.
5
Does BERT need domain adaptation for clinical negation detection?
J Am Med Inform Assoc. 2020 Apr 1;27(4):584-591. doi: 10.1093/jamia/ocaa001.
6
Named entity recognition in electronic health records using transfer learning bootstrapped Neural Networks.
Neural Netw. 2020 Jan;121:132-139. doi: 10.1016/j.neunet.2019.08.032. Epub 2019 Sep 6.
7
BioBERT: a pre-trained biomedical language representation model for biomedical text mining.
Bioinformatics. 2020 Feb 15;36(4):1234-1240. doi: 10.1093/bioinformatics/btz682.
8
Enhancing clinical concept extraction with contextual embeddings.
J Am Med Inform Assoc. 2019 Nov 1;26(11):1297-1304. doi: 10.1093/jamia/ocz096.
9
Multitask learning and benchmarking with clinical time series data.
Sci Data. 2019 Jun 17;6(1):96. doi: 10.1038/s41597-019-0103-9.
10
Neural transfer learning for assigning diagnosis codes to EMRs.
Artif Intell Med. 2019 May;96:116-122. doi: 10.1016/j.artmed.2019.04.002. Epub 2019 Apr 12.