Few-Shot Learning for Clinical Natural Language Processing Using Siamese Neural Networks: Algorithm Development and Validation Study.
Authors
Oniani David, Chandrasekar Premkumar, Sivarajkumar Sonish, Wang Yanshan
Affiliations
Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA, United States.
Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States.
Publication
JMIR AI. 2023 May 4;2:e44293. doi: 10.2196/44293.
BACKGROUND
Natural language processing (NLP) has become an emerging technology in health care that leverages the large amount of free-text data in electronic health records to improve patient care, support clinical decisions, and facilitate clinical and translational science research. Recently, deep learning has achieved state-of-the-art performance in many clinical NLP tasks. However, training deep learning models often requires large annotated data sets, which are normally not publicly available and can be time-consuming to build in clinical domains. Working with smaller annotated data sets is typical in clinical NLP; therefore, ensuring that deep learning models perform well under these conditions is crucial for real-world clinical NLP applications. A widely adopted approach is fine-tuning existing pretrained language models, but this approach falls short when the training data set contains only a few annotated samples. Few-shot learning (FSL) has recently been investigated to tackle this problem. The Siamese neural network (SNN) has been widely used as an FSL approach in computer vision but has not been well studied in NLP, and the literature on its applications in clinical domains is scarce.
OBJECTIVE
The aim of our study is to propose and evaluate SNN-based approaches for few-shot clinical NLP tasks.
METHODS
We propose 2 SNN-based FSL approaches: a pretrained SNN and an SNN with second-order embeddings. We evaluate both approaches on a clinical sentence classification task under 3 few-shot settings: 4-shot, 8-shot, and 16-shot learning. The task is benchmarked using the following 4 pretrained language models: bidirectional encoder representations from transformers (BERT), BERT for biomedical text mining (BioBERT), BioBERT trained on clinical notes (BioClinicalBERT), and generative pretrained transformer 2 (GPT-2). We also compare the SNN-based approaches with a prompt-based GPT-2 approach.
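The core Siamese idea described above, classifying a query by its similarity to a handful of labeled support examples, can be sketched in a few lines. This is a minimal illustration only: it assumes sentence embeddings have already been produced by a pretrained encoder such as BioBERT (the encoding step is not shown), and all names are illustrative rather than the authors' implementation.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def few_shot_classify(query_emb, support_embs, support_labels):
    # Siamese-style few-shot classification: compare the query embedding
    # against each labeled support embedding and return the label of the
    # most similar support example (nearest-neighbor matching).
    sims = [cosine_sim(query_emb, s) for s in support_embs]
    return support_labels[int(np.argmax(sims))]

# Toy 4-shot example with hand-crafted 2-dimensional "embeddings";
# in practice these would come from a pretrained encoder.
support_embs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
support_labels = ["positive", "positive", "negative", "negative"]
query_emb = np.array([0.8, 0.2])
print(few_shot_classify(query_emb, support_embs, support_labels))
```

In a trained SNN, the two inputs pass through a shared encoder and the similarity function is learned (e.g., with a contrastive objective) rather than fixed to raw cosine similarity; the matching step over the support set, however, has this same shape.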
RESULTS
In 4-shot sentence classification tasks, GPT-2 had the highest precision (0.63), but its recall (0.38) and F score (0.42) were lower than those of BioBERT-based pretrained SNN (0.45 and 0.46, respectively). In both 8-shot and 16-shot settings, SNN-based approaches outperformed GPT-2 in all 3 metrics of precision, recall, and F score.
CONCLUSIONS
The experimental results verified the effectiveness of the proposed SNN approaches for few-shot clinical NLP tasks.