

Evaluating the effectiveness of prompt engineering for knowledge graph question answering.

Author information

Kosten Catherine, Nooralahzadeh Farhad, Stockinger Kurt

Affiliations

School of Engineering, Institute of Computer Science, Intelligent Information Systems Research Group, Zurich University of Applied Sciences, Winterthur, Switzerland.

Publication information

Front Artif Intell. 2025 Jan 13;7:1454258. doi: 10.3389/frai.2024.1454258. eCollection 2024.

DOI:10.3389/frai.2024.1454258
PMID:39871862
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11770024/
Abstract

Many different methods for prompting large language models have been developed since the emergence of OpenAI's ChatGPT in November 2022. In this work, we evaluate six different few-shot prompting methods. The first set of experiments evaluates three frameworks that focus on the quantity or type of shots in a prompt: a baseline method with a simple prompt and a small number of shots, random few-shot prompting with 10, 20, and 30 shots, and similarity-based few-shot prompting. The second set of experiments targets optimizing the prompt or enhancing shots through Large Language Model (LLM)-generated explanations, using three prompting frameworks: Explain then Translate, Question Decomposition Meaning Representation, and Optimization by Prompting. We evaluate these six prompting methods on the newly created Spider4SPARQL benchmark, as it is the most complex SPARQL-based Knowledge Graph Question Answering (KGQA) benchmark to date. Across the various prompting frameworks used, the commercial model is unable to achieve a score over 51%, indicating that KGQA, especially for complex queries with multiple hops, set operations, and filters, remains a challenging task for LLMs. Our experiments find that the most successful prompting framework for KGQA is a simple prompt combined with an ontology and five random shots.
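The best-performing setup the abstract reports (a simple prompt plus the knowledge graph ontology and five random shots) can be sketched as below. This is an illustrative reconstruction, not the authors' code; the `build_prompt` function, the prompt layout, and the example shot pool are all assumptions, and real shots would be drawn from a benchmark such as Spider4SPARQL.

```python
import random

def build_prompt(question, ontology, examples, n_shots=5, seed=0):
    """Assemble a simple NL-to-SPARQL prompt: the KG ontology as context,
    followed by n randomly sampled (question, query) demonstration pairs,
    then the target question awaiting its SPARQL translation."""
    rng = random.Random(seed)  # fixed seed for reproducible shot selection
    shots = rng.sample(examples, min(n_shots, len(examples)))
    parts = ["Ontology:", ontology, ""]
    for q, sparql in shots:
        parts += [f"Question: {q}", f"SPARQL: {sparql}", ""]
    parts += [f"Question: {question}", "SPARQL:"]
    return "\n".join(parts)

# Hypothetical shot pool standing in for benchmark training pairs.
pool = [(f"example question {i}", f"SELECT ?x WHERE {{ ?x :p{i} ?y }}")
        for i in range(30)]
prompt = build_prompt("How many singers do we have?",
                      ":Singer a owl:Class .", pool)
```

The resulting string would then be sent to the LLM, whose completion after the final `SPARQL:` marker is taken as the candidate query.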


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e003/11770024/f8b3195e44d1/frai-07-1454258-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e003/11770024/63913884356b/frai-07-1454258-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e003/11770024/f3ac05e624bf/frai-07-1454258-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e003/11770024/4459f1d2bbcb/frai-07-1454258-g0004.jpg

Similar articles

1. Evaluating the effectiveness of prompt engineering for knowledge graph question answering.
   Front Artif Intell. 2025 Jan 13;7:1454258. doi: 10.3389/frai.2024.1454258. eCollection 2024.
2. An Empirical Evaluation of Prompting Strategies for Large Language Models in Zero-Shot Clinical Natural Language Processing: Algorithm Development and Validation Study.
   JMIR Med Inform. 2024 Apr 8;12:e55318. doi: 10.2196/55318.
3. Improving the use of LLMs in radiology through prompt engineering: from precision prompts to zero-shot learning.
   Rofo. 2024 Nov;196(11):1166-1170. doi: 10.1055/a-2264-5631. Epub 2024 Feb 26.
4. Knowledge graph-based thought: a knowledge graph-enhanced LLM framework for pan-cancer question answering.
   Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giae082.
5. Bio-SODA UX: enabling natural language question answering over knowledge graphs with user disambiguation.
   Distrib Parallel Databases. 2022;40(2-3):409-440. doi: 10.1007/s10619-022-07414-w. Epub 2022 Jul 16.
6. Optimizing biomedical information retrieval with a keyword frequency-driven prompt enhancement strategy.
   BMC Bioinformatics. 2024 Aug 27;25(1):281. doi: 10.1186/s12859-024-05902-7.
7. Model tuning or prompt tuning? A study of large language models for clinical concept and relation extraction.
   J Biomed Inform. 2024 May;153:104630. doi: 10.1016/j.jbi.2024.104630. Epub 2024 Mar 26.
8. Emotional prompting amplifies disinformation generation in AI large language models.
   Front Artif Intell. 2025 Apr 7;8:1543603. doi: 10.3389/frai.2025.1543603. eCollection 2025.
9. Evaluating the ChatGPT family of models for biomedical reasoning and classification.
   J Am Med Inform Assoc. 2024 Apr 3;31(4):940-948. doi: 10.1093/jamia/ocad256.
10. Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models.
   JMIR Med Educ. 2024 Feb 13;10:e51391. doi: 10.2196/51391.
