
LOGIC: LLM-originated guidance for internal cognitive improvement of small language models in stance detection.

Author Information

Lee Woojin, Lee Jaewook, Kim Harksoo

Affiliations

Department of Artificial Intelligence, Konkuk University, Seoul, Republic of Korea.

Department of Computer Science and Engineering, Konkuk University, Seoul, Republic of Korea.

Publication Information

PeerJ Comput Sci. 2024 Dec 3;10:e2585. doi: 10.7717/peerj-cs.2585. eCollection 2024.

DOI: 10.7717/peerj-cs.2585
PMID: 39650400
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11623219/
Abstract

Stance detection is a critical task in natural language processing that determines an author's viewpoint toward a specific target, playing a pivotal role in social science research and various applications. Traditional approaches incorporating Wikipedia-sourced data into small language models (SLMs) to compensate for limited target knowledge often suffer from inconsistencies in article quality and length due to the diverse pool of Wikipedia contributors. To address these limitations, we utilize large language models (LLMs) pretrained on expansive datasets to generate accurate and contextually relevant target knowledge. By providing concise, real-world insights tailored to the stance detection task, this approach surpasses the limitations of Wikipedia-based information. Despite their superior reasoning capabilities, LLMs are computationally intensive and challenging to deploy on smaller devices. To mitigate these drawbacks, we introduce a reasoning distillation methodology that transfers the reasoning capabilities of LLMs to more compact SLMs, enhancing their efficiency while maintaining robust performance. Our stance detection model, LOGIC (LLM-Originated Guidance for Internal Cognitive improvement of small language models in stance detection), is built on Bidirectional and Auto-Regressive Transformer (BART) and fine-tuned with auxiliary learning tasks, including reasoning distillation. By incorporating LLM-generated target knowledge into the inference process, LOGIC achieves state-of-the-art performance on the VAried Stance Topics (VAST) dataset, outperforming advanced models like GPT-3.5 Turbo and GPT-4 Turbo in stance detection tasks.
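The abstract does not give implementation details, but the pipeline it describes (an LLM supplies concise target knowledge, and a BART-based small model is fine-tuned with an auxiliary distillation objective) can be illustrated with a minimal sketch. The sketch below assumes a Hugging Face BART classifier; the function generate_target_knowledge, the input format, the three-way label set, and the KL-based distillation term are placeholders chosen for illustration, not the authors' actual method.

# Hypothetical sketch of a LOGIC-style setup: an LLM supplies target knowledge,
# and a small BART classifier is trained with the usual stance loss plus an
# auxiliary distillation term. Names such as generate_target_knowledge() are
# placeholders, not code from the paper.

import torch
import torch.nn.functional as F
from transformers import BartTokenizer, BartForSequenceClassification

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForSequenceClassification.from_pretrained(
    "facebook/bart-base", num_labels=3  # e.g., pro / con / neutral
)

def generate_target_knowledge(target: str) -> str:
    # Placeholder for an LLM call that returns concise background about the target.
    return f"Background knowledge about {target} produced by a large language model."

def stance_loss(text, target, label, teacher_logits=None, alpha=0.5):
    # Append LLM-generated target knowledge to the input before encoding.
    knowledge = generate_target_knowledge(target)
    inputs = tokenizer(text, f"{target}. {knowledge}", return_tensors="pt", truncation=True)
    logits = model(**inputs).logits

    # Primary objective: cross-entropy against the gold stance label.
    loss = F.cross_entropy(logits, torch.tensor([label]))

    # Auxiliary objective: distill soft predictions from the teacher, if provided.
    if teacher_logits is not None:
        loss = loss + alpha * F.kl_div(
            F.log_softmax(logits, dim=-1),
            F.softmax(teacher_logits, dim=-1),
            reduction="batchmean",
        )
    return loss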


Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1595/11623219/6c0b74a0b312/peerj-cs-10-2585-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1595/11623219/c4b4c4c7bdd8/peerj-cs-10-2585-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1595/11623219/89d8250beec5/peerj-cs-10-2585-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1595/11623219/698585a6eec7/peerj-cs-10-2585-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1595/11623219/2e4c9809cf5e/peerj-cs-10-2585-g005.jpg

Similar Articles

1. LOGIC: LLM-originated guidance for internal cognitive improvement of small language models in stance detection. PeerJ Comput Sci. 2024 Dec 3;10:e2585. doi: 10.7717/peerj-cs.2585. eCollection 2024.
2. A dataset and benchmark for hospital course summarization with adapted large language models. J Am Med Inform Assoc. 2025 Mar 1;32(3):470-479. doi: 10.1093/jamia/ocae312.
3. Virtual Patients Using Large Language Models: Scalable, Contextualized Simulation of Clinician-Patient Dialogue With Feedback. J Med Internet Res. 2025 Apr 4;27:e68486. doi: 10.2196/68486.
4. Automated Radiology Report Labeling in Chest X-Ray Pathologies: Development and Evaluation of a Large Language Model Framework. JMIR Med Inform. 2025 Mar 28;13:e68618. doi: 10.2196/68618.
5. Large Language Models in Worldwide Medical Exams: Platform Development and Comprehensive Analysis. J Med Internet Res. 2024 Dec 27;26:e66114. doi: 10.2196/66114.
6. Optimizing biomedical information retrieval with a keyword frequency-driven prompt enhancement strategy. BMC Bioinformatics. 2024 Aug 27;25(1):281. doi: 10.1186/s12859-024-05902-7.
7. Large Language Model Enhanced Logic Tensor Network for Stance Detection. Neural Netw. 2025 Mar;183:106956. doi: 10.1016/j.neunet.2024.106956. Epub 2024 Nov 29.
8. Evaluation of Large Language Models in Tailoring Educational Content for Cancer Survivors and Their Caregivers: Quality Analysis. JMIR Cancer. 2025 Apr 7;11:e67914. doi: 10.2196/67914.
9. Leveraging Medical Knowledge Graphs Into Large Language Models for Diagnosis Prediction: Design and Application Study. JMIR AI. 2025 Feb 24;4:e58670. doi: 10.2196/58670.
10. Exploring the reversal curse and other deductive logical reasoning in BERT and GPT-based large language models. Patterns (N Y). 2024 Jul 25;5(9):101030. doi: 10.1016/j.patter.2024.101030. eCollection 2024 Sep 13.
