Few-shot learning for medical text: A review of advances, trends, and opportunities.

Affiliations

Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA, United States of America.

Department of Biomedical Informatics, Vanderbilt University Medical Center, Vanderbilt University, Nashville, TN, United States of America.

Publication information

J Biomed Inform. 2023 Aug;144:104458. doi: 10.1016/j.jbi.2023.104458. Epub 2023 Jul 23.


DOI: 10.1016/j.jbi.2023.104458
PMID: 37488023
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10940971/
Abstract

BACKGROUND: Few-shot learning (FSL) is a class of machine learning methods that require small numbers of labeled instances for training. With many medical topics having limited annotated text-based data in practical settings, FSL-based natural language processing (NLP) holds substantial promise. We aimed to conduct a review to explore the current state of FSL methods for medical NLP.

METHODS: We searched for articles published between January 2016 and October 2022 using PubMed/Medline, Embase, ACL Anthology, and IEEE Xplore Digital Library. We also searched the preprint servers (e.g., arXiv, medRxiv, and bioRxiv) via Google Scholar to identify the latest relevant methods. We included all articles that involved FSL and any form of medical text. We abstracted articles based on the data source, target task, training set size, primary method(s)/approach(es), and evaluation metric(s).

RESULTS: Fifty-one articles met our inclusion criteria, all published after 2018, and most since 2020 (42/51; 82%). Concept extraction/named entity recognition was the most frequently addressed task (21/51; 41%), followed by text classification (16/51; 31%). Thirty-two (61%) articles reconstructed existing datasets to fit few-shot scenarios, and MIMIC-III was the most frequently used dataset (10/51; 20%). 77% of the articles attempted to incorporate prior knowledge to augment the small datasets available for training. Common methods included FSL with attention mechanisms (20/51; 39%), prototypical networks (11/51; 22%), meta-learning (7/51; 14%), and prompt-based learning methods, the latter being particularly popular since 2021. Benchmarking experiments demonstrated relative underperformance of FSL methods on biomedical NLP tasks.

CONCLUSION: Despite the potential for FSL in biomedical NLP, progress has been limited. This may be attributed to the rarity of specialized data, lack of standardized evaluation criteria, and the underperformance of FSL methods on biomedical topics. The creation of publicly available specialized datasets for biomedical FSL may aid method development by facilitating comparative analyses.
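
The abstract mentions two ideas without spelling them out: reconstructing an existing dataset to fit a few-shot scenario, and classifying with prototypical networks. The sketch below is only a rough illustration of both and is not taken from any of the reviewed articles. It assumes text has already been turned into fixed-length embedding vectors. The function names (`sample_episode`, `prototypes`, `classify`), the three labels, and the 2-D toy embeddings are all hypothetical. A real prototypical network would also learn the embedding function, which is omitted here.

```python
import random
from collections import defaultdict


def sample_episode(dataset, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode from a list of (embedding, label) pairs."""
    by_label = defaultdict(list)
    for embedding, label in dataset:
        by_label[label].append(embedding)
    # Only classes with enough examples can fill both support and query sets.
    eligible = sorted(
        label for label, items in by_label.items()
        if len(items) >= k_shot + n_query
    )
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for label in classes:
        items = rng.sample(by_label[label], k_shot + n_query)
        support += [(e, label) for e in items[:k_shot]]
        query += [(e, label) for e in items[k_shot:]]
    return support, query


def prototypes(support):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(embedding, protos):
    """Return the label of the nearest prototype (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda label: sq_dist(embedding, protos[label]))


# Hypothetical toy data: three well-separated clusters of 2-D "embeddings".
centers = {"medication": (0.0, 0.0), "symptom": (5.0, 5.0), "procedure": (10.0, 0.0)}
toy = [
    ([cx + 0.1 * i, cy - 0.1 * i], label)
    for label, (cx, cy) in centers.items()
    for i in range(6)
]

rng = random.Random(0)
support, query = sample_episode(toy, n_way=2, k_shot=3, n_query=2, rng=rng)
protos = prototypes(support)
accuracy = sum(classify(e, protos) == label for e, label in query) / len(query)
print(f"{len(support)} support, {len(query)} query, accuracy={accuracy:.2f}")
```

Each episode holds out a few labeled examples per class as queries, so performance is measured in the same low-data regime the model is meant for. The toy clusters are deliberately easy to separate; real clinical text embeddings would not be.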
