Revisiting Relation Extraction in the era of Large Language Models.

Authors

Wadhwa Somin, Amir Silvio, Wallace Byron C

Affiliations

Northeastern University.

Publication

Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:15566-15589. doi: 10.18653/v1/2023.acl-long.868.

DOI: 10.18653/v1/2023.acl-long.868
PMID: 37674787
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10482322/
Abstract

Relation extraction (RE) is the core NLP task of inferring semantic relationships between entities from text. Standard supervised RE techniques entail training modules to tag tokens comprising entity spans and then predict the relationship between them. Recent work has instead treated the problem as a sequence-to-sequence task, linearizing relations between entities as target strings to be generated conditioned on the input. Here we push the limits of this approach, using larger language models (GPT-3 and Flan-T5 large) than considered in prior work and evaluating their performance on standard RE tasks under varying levels of supervision. We address issues inherent to evaluating generative approaches to RE by doing human evaluations, in lieu of relying on exact matching. Under this refined evaluation, we find that: (1) prompting with GPT-3 achieves near SOTA performance, i.e., roughly equivalent to existing models; (2) Flan-T5 is not as capable in the few-shot setting, but supervising and fine-tuning it with Chain-of-Thought (CoT) style explanations (generated via GPT-3) yields SOTA results. We release this model as a new baseline for RE tasks.
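
The generative formulation described in the abstract can be made concrete with a small sketch. The Python below is a hypothetical illustration, not the authors' released code: the <rel>/<sep> markers, the relation label, and the example sentences are assumptions chosen only to show how relation triples might be linearized into target strings and how a few-shot prompt for a GPT-3-style completion model could be assembled.

# Minimal sketch (hypothetical, not from the paper): linearize relation
# triples into a target string for a sequence-to-sequence RE model, then
# build a few-shot prompt asking a completion model to generate that string.

def linearize(triples):
    """Join (head, relation, tail) triples into one target string.
    The <rel>/<sep> markers are illustrative, not a fixed standard."""
    return " <sep> ".join(f"{h} <rel> {r} <rel> {t}" for h, r, t in triples)

def build_prompt(exemplars, test_sentence):
    """Concatenate labeled exemplars, then the test sentence with an
    empty 'Relations:' slot for the model to complete."""
    parts = [
        f"Sentence: {sentence}\nRelations: {linearize(triples)}"
        for sentence, triples in exemplars
    ]
    parts.append(f"Sentence: {test_sentence}\nRelations:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    # Toy adverse-drug-event style exemplar (invented for illustration).
    exemplars = [
        ("Aspirin can cause stomach bleeding.",
         [("aspirin", "adverse-effect", "stomach bleeding")]),
    ]
    print(build_prompt(exemplars, "Ibuprofen has been linked to kidney injury."))

Because a generated string need not match the gold annotation token for token (e.g., "stomach bleeding" versus "bleeding of the stomach"), exact-match scoring can undercount correct extractions; this is the evaluation issue the authors address by using human evaluation instead.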


Figures (from PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7034/10482322/1708e9b037d7/nihms-1912166-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7034/10482322/b3708d1ede71/nihms-1912166-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7034/10482322/4a8dda60e124/nihms-1912166-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7034/10482322/8039f5ec0102/nihms-1912166-f0004.jpg

Similar Articles

1
Revisiting Relation Extraction in the era of Large Language Models.
Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:15566-15589. doi: 10.18653/v1/2023.acl-long.868.
2
An Empirical Evaluation of Prompting Strategies for Large Language Models in Zero-Shot Clinical Natural Language Processing: Algorithm Development and Validation Study.
JMIR Med Inform. 2024 Apr 8;12:e55318. doi: 10.2196/55318.
3
Leveraging GPT-4 for identifying cancer phenotypes in electronic health records: a performance comparison between GPT-4, GPT-3.5-turbo, Flan-T5, Llama-3-8B, and spaCy's rule-based and machine learning-based methods.
JAMIA Open. 2024 Jul 3;7(3):ooae060. doi: 10.1093/jamiaopen/ooae060. eCollection 2024 Oct.
4
Leveraging GPT-4 for Identifying Cancer Phenotypes in Electronic Health Records: A Performance Comparison between GPT-4, GPT-3.5-turbo, Flan-T5 and spaCy's Rule-based & Machine Learning-based methods.
bioRxiv. 2024 Apr 6:2023.09.27.559788. doi: 10.1101/2023.09.27.559788.
5
A Study of Biomedical Relation Extraction Using GPT Models.
AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:391-400. eCollection 2024.
6
GPT-4 as an X data annotator: Unraveling its performance on a stance classification task.
PLoS One. 2024 Aug 15;19(8):e0307741. doi: 10.1371/journal.pone.0307741. eCollection 2024.
7
A comparison of chain-of-thought reasoning strategies across datasets and models.
PeerJ Comput Sci. 2024 Apr 30;10:e1999. doi: 10.7717/peerj-cs.1999. eCollection 2024.
8
Extraction of semantic biomedical relations from text using conditional random fields.
BMC Bioinformatics. 2008 Apr 23;9:207. doi: 10.1186/1471-2105-9-207.
9
BertSRC: transformer-based semantic relation classification.
BMC Med Inform Decis Mak. 2022 Sep 6;22(1):234. doi: 10.1186/s12911-022-01977-5.
10
Emergent analogical reasoning in large language models.
Nat Hum Behav. 2023 Sep;7(9):1526-1541. doi: 10.1038/s41562-023-01659-w. Epub 2023 Jul 31.

Cited By

1
Large language models can extract metadata for annotation of human neuroimaging publications.
Front Neuroinform. 2025 Aug 20;19:1609077. doi: 10.3389/fninf.2025.1609077. eCollection 2025.
2
Reduction of supervision for biomedical knowledge discovery.
BMC Bioinformatics. 2025 Sep 1;26(1):225. doi: 10.1186/s12859-025-06187-0.
3
How important is domain-specific language model pretraining and instruction finetuning for biomedical relation extraction?
Nat Lang Process Inf Syst. 2026;15836:80-94. doi: 10.1007/978-3-031-97141-9_6. Epub 2025 Jul 1.
4
Large Language Models Can Extract Metadata for Annotation of Human Neuroimaging Publications.
bioRxiv. 2025 May 14:2025.05.13.653828. doi: 10.1101/2025.05.13.653828.
5
ERNIE-UIE: Advancing information extraction in Chinese medical knowledge graph.
PLoS One. 2025 May 29;20(5):e0325082. doi: 10.1371/journal.pone.0325082. eCollection 2025.
6
Harnessing the Power of Large Language Models (LLMs) to Unravel the Influence of Genes and Medications on Biological Processes of Wound Healing.
AMIA Annu Symp Proc. 2025 May 22;2024:571-580. eCollection 2024.
7
Enhancing Relation Extraction for COVID-19 Vaccine Shot-Adverse Event Associations with Large Language Models.
Res Sq. 2025 Mar 17:rs.3.rs-6201919. doi: 10.21203/rs.3.rs-6201919/v1.
8
LLM-IE: a python package for biomedical generative information extraction with large language models.
JAMIA Open. 2025 Mar 12;8(2):ooaf012. doi: 10.1093/jamiaopen/ooaf012. eCollection 2025 Apr.
9
Approaches for extracting daily dosage from free-text prescription signatures in heart failure with reduced ejection fraction: a comparative study.
JAMIA Open. 2025 Jan 3;8(1):ooae153. doi: 10.1093/jamiaopen/ooae153. eCollection 2025 Feb.
10
Performance and Reproducibility of Large Language Models in Named Entity Recognition: Considerations for the Use in Controlled Environments.
Drug Saf. 2025 Mar;48(3):287-303. doi: 10.1007/s40264-024-01499-1. Epub 2024 Dec 11.

References Cited in This Article

1
Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports.
J Biomed Inform. 2012 Oct;45(5):885-92. doi: 10.1016/j.jbi.2012.04.008. Epub 2012 Apr 25.
2
ChemProt: a disease chemical biology database.
Nucleic Acids Res. 2011 Jan;39(Database issue):D367-72. doi: 10.1093/nar/gkq906. Epub 2010 Oct 8.