Evaluating the influence of prompt formulation on the reliability and repeatability of ChatGPT in implant-supported prostheses.

Authors

Freire Yolanda, Santamaría Laorden Andrea, Orejas Pérez Jaime, Ortiz Collado Ignacio, Gómez Sánchez Margarita, Thuissard Vasallo Israel J, Díaz-Flores García Víctor, Suárez Ana

Affiliations

Department of Preclinical Dentistry II, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Villaviciosa de Odón, Madrid, Spain.

Department of Preclinical Dentistry I, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Villaviciosa de Odón, Madrid, Spain.

Publication information

PLoS One. 2025 May 30;20(5):e0323086. doi: 10.1371/journal.pone.0323086. eCollection 2025.

DOI: 10.1371/journal.pone.0323086
PMID: 40445924
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12124515/

Abstract

Large language models (LLMs) such as ChatGPT are widely available to any dental professional. However, there is limited evidence on the reliability and repeatability of ChatGPT-4 in relation to implant-supported prostheses, or on the impact of prompt design on its responses. This constrains understanding of its application within this specific area of dentistry. The purpose of this study was to evaluate the performance of ChatGPT-4 in generating answers about implant-supported prostheses using different prompts. Thirty questions on implant-supported and implant-retained prostheses were posed, and 30 answers were generated per question with each of two prompts, one general and one specific, for a total of 1800 answers. Experts assessed reliability (agreement with expert grading) and repeatability (response consistency) using a 3-point Likert scale. General prompts achieved 70.89% reliability, with repeatability ranging from moderate to almost perfect. Specific prompts performed better, with 78.8% reliability and substantial to almost perfect repeatability. The specific prompt significantly improved reliability compared to the general prompt. Despite these promising results, ChatGPT's ability to generate reliable answers on implant-supported prostheses remains limited, highlighting the need for professional oversight. Using a specific prompt might improve its answer generation performance.
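
The design arithmetic and the agreement labels in the abstract can be illustrated with a minimal Python sketch. Two points are assumptions, not statements from the study: the abstract does not give the reliability formula, so `reliability_pct` assumes it is the share of answers receiving the top grade on the 3-point scale; and the abstract does not name the repeatability statistic, so `agreement_label` assumes the wording ("moderate", "substantial", "almost perfect") follows the Landis and Koch bands commonly used for kappa-type coefficients. All function names and the sample grades are hypothetical.

```python
# Design figures stated in the abstract.
QUESTIONS = 30
ANSWERS_PER_QUESTION_PER_PROMPT = 30
PROMPT_TYPES = ("general", "specific")


def total_answers() -> int:
    """Questions x answers per question x prompt types."""
    return QUESTIONS * ANSWERS_PER_QUESTION_PER_PROMPT * len(PROMPT_TYPES)


def reliability_pct(grades: list[int], top_grade: int = 3) -> float:
    """Percentage of answers that received the top grade.

    ASSUMPTION: the abstract does not define the reliability formula;
    this definition is only an illustration.
    """
    if not grades:
        raise ValueError("grades must not be empty")
    return 100.0 * sum(g == top_grade for g in grades) / len(grades)


def agreement_label(coefficient: float) -> str:
    """Map an agreement coefficient to a Landis and Koch style label.

    ASSUMPTION: the abstract does not name the statistic it used.
    """
    if coefficient > 1.0:
        raise ValueError("coefficient must not exceed 1.0")
    if coefficient < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if coefficient <= upper_bound:
            return label
    raise ValueError("unreachable for coefficients up to 1.0")


if __name__ == "__main__":
    print(total_answers())                # 1800
    print(reliability_pct([3, 3, 2, 1]))  # 50.0 (made-up grades)
    print(agreement_label(0.55))          # moderate
```

Under these assumptions, the reported range "moderate to almost perfect" for the general prompt would correspond to coefficients from above 0.40 up to 1.00, and "substantial to almost perfect" for the specific prompt to coefficients above 0.60.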

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7323/12124515/ed988627c861/pone.0323086.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7323/12124515/4f34a009e1b2/pone.0323086.g001.jpg
