Suppr 超能文献


Enhancing semantical text understanding with fine-tuned large language models: A case study on Quora Question Pair duplicate identification.

Authors

Han Sifei, Shi Lingyun, Tsui Fuchiang Rich

Affiliations

Department of Biomedical and Health Informatics, Tsui Laboratory, Children's Hospital of Philadelphia, Philadelphia, PA, United States of America.

Department of Anesthesiology and Critical Care, Children's Hospital of Philadelphia, Philadelphia, PA, United States of America.

Publication

PLoS One. 2025 Jan 10;20(1):e0317042. doi: 10.1371/journal.pone.0317042. eCollection 2025.

DOI: 10.1371/journal.pone.0317042
PMID: 39792917
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11723592/
Abstract

Semantical text understanding holds significant importance in natural language processing (NLP). Numerous datasets, such as Quora Question Pairs (QQP), have been devised for this purpose. In our previous study, we developed a Siamese Convolutional Neural Network (S-CNN) that achieved an F1 score of 82.02% (95% C.I.: 81.83%-82.20%). Given the growing attention toward large language models (LLMs) like ChatGPT, we aimed to explore their effectiveness in text similarity tasks. In this research, we leveraged 5 pretrained LLMs, conducted various fine-tuning approaches (prompt engineering, n-shot learning, and supervised learning using the low-rank adaptation [LoRA]), and compared their performance using F1 score. To ensure a fair comparison, we followed our previous study's design and dataset by employing a 10-fold cross-validation for supervised model training and evaluation. Additionally, we conducted a secondary study by introducing a recent larger LLM with 70B parameters and comparing it with the 7B model using the GLUE benchmark, and both models were finetuned with the corpus. The fine-tuned LLaMA model with 7B parameters (qLLaMA_LoRA-7B) using 100,000 QQP corpus yielded the best results, achieving an F1 score of 84.9% (95% C.I.: 84.13%-85.67%), which outperformed the Alpaca_LoRA-65B (finetuned based on LLaMA-65B) (F1: 64.98% [64.72%-65.25%]; P<0.01) and had a 3% improvement compared to our previously published best model, S-CNN. The finetuned LLaMA3.1-70B (qLLaMA3.1_LoRA-70B) with 70B parameters (F1: 74.4%) outperformed the qLLaMA_LoRA-7B (F1: 71.9%) using the GLUE benchmark. The study demonstrated an effective LLM finetuning framework, which highlights the importance of finetuning LLMs for improved performance. Our task-specific supervised finetuning demonstrated improved LLM performance compared to larger pretrained models with or without n-shot learning; moreover, finetuning a larger LLM further improved performance compared to finetuning a smaller LLM. 
Our LLM-based finetuning framework may potentially improve various document similarity tasks, such as matching resumes with job descriptions, recommending subject-matter experts, or identifying potential reviewers for grant proposals or manuscript submissions.
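The abstract reports each model's F1 score with a 95% confidence interval derived from 10-fold cross-validation. As an illustration of how such an interval can be computed from per-fold scores, here is a minimal self-contained sketch; the fold values are hypothetical and the t-distribution-based interval is one common choice, not necessarily the authors' exact method:

```python
import math

def f1_score(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall), from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_and_ci95(values):
    """Mean and a t-based 95% CI across folds (t ~= 2.262 for 9 df with 10 folds)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    half = 2.262 * math.sqrt(var / n)                     # t_{0.975, df=9} * SE
    return mean, (mean - half, mean + half)

# Hypothetical per-fold F1 scores from a 10-fold cross-validation run
fold_f1 = [0.848, 0.852, 0.845, 0.851, 0.849, 0.853, 0.846, 0.850, 0.847, 0.852]
mean, (lo, hi) = mean_and_ci95(fold_f1)
print(f"F1 = {mean:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

Reporting the interval alongside the point estimate, as the paper does, is what allows statements like "qLLaMA_LoRA-7B outperformed Alpaca_LoRA-65B (P < 0.01)" to be made with non-overlapping intervals.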


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e75/11723592/cd932ecdbf78/pone.0317042.g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e75/11723592/372868532019/pone.0317042.g002.jpg

Similar Articles

1. Enhancing semantical text understanding with fine-tuned large language models: A case study on Quora Question Pair duplicate identification. PLoS One. 2025 Jan 10;20(1):e0317042. doi: 10.1371/journal.pone.0317042. eCollection 2025.
2. PH-LLM: Public Health Large Language Models for Infoveillance. medRxiv. 2025 Feb 10:2025.02.08.25321587. doi: 10.1101/2025.02.08.25321587.
3. Evaluating large language models for health-related text classification tasks with public social media data. J Am Med Inform Assoc. 2024 Oct 1;31(10):2181-2189. doi: 10.1093/jamia/ocae210.
4. BioInstruct: instruction tuning of large language models for biomedical natural language processing. J Am Med Inform Assoc. 2024 Sep 1;31(9):1821-1832. doi: 10.1093/jamia/ocae122.
5. Improving entity recognition using ensembles of deep learning and fine-tuned large language models: A case study on adverse event extraction from VAERS and social media. J Biomed Inform. 2025 Mar;163:104789. doi: 10.1016/j.jbi.2025.104789. Epub 2025 Feb 7.
6. Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2024 Mar;8(1). doi: 10.1145/3643540. Epub 2024 Mar 6.
7. An Empirical Evaluation of Prompting Strategies for Large Language Models in Zero-Shot Clinical Natural Language Processing: Algorithm Development and Validation Study. JMIR Med Inform. 2024 Apr 8;12:e55318. doi: 10.2196/55318.
8. An empirical study of LLaMA3 quantization: from LLMs to MLLMs. Vis Intell. 2024;2(1):36. doi: 10.1007/s44267-024-00070-x. Epub 2024 Dec 30.
9. Semantic Clinical Artificial Intelligence vs Native Large Language Model Performance on the USMLE. JAMA Netw Open. 2025 Apr 1;8(4):e256359. doi: 10.1001/jamanetworkopen.2025.6359.
10. Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae163.
