Knowledge graph construction for heart failure using large language models with prompt engineering.

Authors

Xu Tianhan, Gu Yixun, Xue Mantian, Gu Renjie, Li Bin, Gu Xiang

Affiliations

School of Information Engineering, Yangzhou University, Yangzhou, Jiangsu, China.

School of Information Engineering, Yangzhou Polytechnic Institute, Yangzhou, Jiangsu, China.

Publication information

Front Comput Neurosci. 2024 Jul 2;18:1389475. doi: 10.3389/fncom.2024.1389475. eCollection 2024.

Abstract

INTRODUCTION

Constructing an accurate and comprehensive knowledge graph of a specific disease is critical for clinical diagnosis and treatment, reasoning and decision support, rehabilitation, and health management. For knowledge graph construction tasks (such as named entity recognition and relation extraction), classical BERT-based methods require a large amount of training data to ensure model performance. However, real-world annotated medical data, especially disease-specific annotated samples, are very limited. In addition, existing models perform poorly at recognizing out-of-distribution entities and relations that were not seen during training.

METHOD

In this study, we present a novel and practical pipeline for constructing a heart failure knowledge graph using large language models and medical expert refinement. We apply prompt engineering to the three phases of the pipeline: schema design, information extraction, and knowledge completion. The best performance is achieved by combining task-specific prompt templates with the TwoStepChat approach.
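
The abstract does not spell out how TwoStepChat works, so the following is only a minimal sketch of one plausible two-step prompting flow for the information extraction phase: a first prompt asks which entity types occur in a text, and a second prompt per detected type asks for the matching entities. The entity types, the prompt templates, and the names `two_step_extract`, `parse_list`, and `fake_llm` are all assumptions made for illustration and are not taken from the paper. The model is passed in as a plain callable so the example runs without any external service.

```python
from typing import Callable, Dict, List

# Hypothetical schema: these entity types are illustrative, not the paper's schema.
ENTITY_TYPES = ["Disease", "Symptom", "Drug"]

# Hypothetical prompt templates for the two steps.
STEP1_TEMPLATE = (
    "Text: {text}\n"
    "Which of these entity types appear in the text? {types}\n"
    "Answer with a comma-separated list of type names, or 'none'."
)
STEP2_TEMPLATE = (
    "Text: {text}\n"
    "List every entity of type '{etype}' in the text, "
    "separated by semicolons, or 'none'."
)


def parse_list(reply: str, sep: str) -> List[str]:
    """Split a model reply on `sep`, dropping blanks and the literal 'none'."""
    items = [item.strip() for item in reply.split(sep)]
    return [item for item in items if item and item.lower() != "none"]


def two_step_extract(
    text: str,
    llm: Callable[[str], str],
    entity_types: List[str] = ENTITY_TYPES,
) -> Dict[str, List[str]]:
    """Step 1: detect which entity types occur. Step 2: extract entities per type."""
    step1_reply = llm(STEP1_TEMPLATE.format(text=text, types=", ".join(entity_types)))
    # Keep only type names that belong to the schema.
    present = [t for t in parse_list(step1_reply, ",") if t in entity_types]

    result: Dict[str, List[str]] = {}
    for etype in present:
        step2_reply = llm(STEP2_TEMPLATE.format(text=text, etype=etype))
        entities = parse_list(step2_reply, ";")
        if entities:
            result[etype] = entities
    return result


def fake_llm(prompt: str) -> str:
    """Stand-in for a real chat model, returning canned replies for the demo."""
    if "Which of these entity types" in prompt:
        return "Disease, Drug"
    if "type 'Disease'" in prompt:
        return "heart failure"
    if "type 'Drug'" in prompt:
        return "furosemide; metoprolol"
    return "none"


if __name__ == "__main__":
    sample = "The patient with heart failure was given furosemide and metoprolol."
    print(two_step_extract(sample, fake_llm))
```

In practice `fake_llm` would be replaced by a call to an actual chat model, and, as the abstract notes, the extracted output would still go through medical expert refinement.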

RESULTS

Experiments on two datasets show that the TwoStepChat method outperforms both the vanilla prompt and the fine-tuned BERT-based baselines. Moreover, our method saves 65% of the time compared to manual annotation and is better suited to extracting out-of-distribution information in the real world.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab6f/11250484/3cb8a40e0689/fncom-18-1389475-g0001.jpg
