ChatGPT's performance in dentistry and allergy-immunology assessments: a comparative study.

Affiliations

Department of Periodontology, Endodontology and Cariology, University Center for Dental Medicine Basel UZB, University of Basel, Basel, Switzerland.

Division of Allergy, University Children's Hospital Basel, Basel, Switzerland.

Publication information

Swiss Dent J. 2023 Oct 4;134(2):1-17. doi: 10.61872/sdj-2024-06-01.

DOI: 10.61872/sdj-2024-06-01
PMID: 38726506

Abstract

Large language models (LLMs) such as ChatGPT have potential applications in healthcare, including dentistry. Priming, the practice of providing LLMs with initial, relevant information, is an approach to improve their output quality. This study aimed to evaluate the performance of ChatGPT 3 and ChatGPT 4 on self-assessment questions for dentistry, through the Swiss Federal Licensing Examination in Dental Medicine (SFLEDM), and allergy and clinical immunology, through the European Examination in Allergy and Clinical Immunology (EEAACI). The second objective was to assess the impact of priming on ChatGPT's performance. The SFLEDM and EEAACI multiple-choice questions from the University of Bern's Institute for Medical Education platform were administered to both ChatGPT versions, with and without priming. Performance was analyzed based on correct responses. The statistical analysis included Wilcoxon rank sum tests (alpha=0.05). The average accuracy rates in the SFLEDM and EEAACI assessments were 63.3% and 79.3%, respectively. Both ChatGPT versions performed better on EEAACI than SFLEDM, with ChatGPT 4 outperforming ChatGPT 3 across all tests. ChatGPT 3's performance exhibited a significant improvement with priming for both EEAACI (p=0.017) and SFLEDM (p=0.024) assessments. For ChatGPT 4, the priming effect was significant only in the SFLEDM assessment (p=0.038). The performance disparity between SFLEDM and EEAACI assessments underscores ChatGPT's varying proficiency across different medical domains, likely tied to the nature and amount of training data available in each field. Priming can be a tool for enhancing output, especially in earlier LLMs. Advancements from ChatGPT 3 to 4 highlight the rapid developments in LLM technology. Yet, their use in critical fields such as healthcare must remain cautious owing to LLMs' inherent limitations and risks.
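The abstract reports Wilcoxon rank sum tests at alpha=0.05 for the primed versus unprimed comparisons. The following is a minimal sketch of how such a test can be computed, not the authors' analysis code. It assumes the normal approximation with a tie correction and no continuity correction; the function names and the sample scores are hypothetical and are not data from the study.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Assumption: no continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    w = sum(ranks[:n1])            # rank sum of the first sample
    mean_w = n1 * (n + 1) / 2      # expected rank sum under the null

    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))

    if var_w == 0:                 # every observation identical
        return 0.0, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical per-run accuracy (%) values, invented for illustration only.
unprimed = [58.0, 60.5, 61.0, 59.5, 62.0]
primed = [64.0, 66.5, 63.5, 67.0, 65.0]
z, p = rank_sum_test(unprimed, primed)
print(f"z = {z:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

With samples as small as those above, an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch dependency-free.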

Similar articles

1
ChatGPT's performance in dentistry and allergy-immunology assessments: a comparative study.
Swiss Dent J. 2023 Oct 4;134(2):1-17. doi: 10.61872/sdj-2024-06-01.
2
ChatGPT's performance in dentistry and allergy-immunology assessments: a comparative study.
Swiss Dent J. 2023 Oct 6;134(5).
3
Exploring the Performance of ChatGPT Versions 3.5, 4, and 4 With Vision in the Chilean Medical Licensing Examination: Observational Study.
JMIR Med Educ. 2024 Apr 29;10:e55048. doi: 10.2196/55048.
4
Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis.
J Med Internet Res. 2024 Jul 25;26:e60807. doi: 10.2196/60807.
5
Appraisal of ChatGPT's Aptitude for Medical Education: Comparative Analysis With Third-Year Medical Students in a Pulmonology Examination.
JMIR Med Educ. 2024 Jul 23;10:e52818. doi: 10.2196/52818.
6
Performance of ChatGPT-3.5 and GPT-4 in national licensing examinations for medicine, pharmacy, dentistry, and nursing: a systematic review and meta-analysis.
BMC Med Educ. 2024 Sep 16;24(1):1013. doi: 10.1186/s12909-024-05944-8.
7
Performance of ChatGPT on the Chinese Postgraduate Examination for Clinical Medicine: Survey Study.
JMIR Med Educ. 2024 Feb 9;10:e48514. doi: 10.2196/48514.
8
ChatGPT's performance in German OB/GYN exams - paving the way for AI-enhanced medical education and clinical practice.
Front Med (Lausanne). 2023 Dec 13;10:1296615. doi: 10.3389/fmed.2023.1296615. eCollection 2023.
9
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study.
J Med Internet Res. 2023 Dec 28;25:e51580. doi: 10.2196/51580.
10
Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam.
Int J Nurs Stud. 2024 May;153:104717. doi: 10.1016/j.ijnurstu.2024.104717. Epub 2024 Feb 8.

Cited by

1
Generative Pre-trained Transformer: Trends, Applications, Strengths and Challenges in Dentistry: A Systematic Review.
Healthc Inform Res. 2025 Apr;31(2):189-199. doi: 10.4258/hir.2025.31.2.189. Epub 2025 Apr 30.
2
Can Large Language Models Serve as Reliable Tools for Information in Dentistry? A Systematic Review.
Int Dent J. 2025 May 16;75(4):100835. doi: 10.1016/j.identj.2025.04.015.
3
Comparative analysis of ChatGPT and Gemini (Bard) in medical inquiry: a scoping review.
Front Digit Health. 2025 Feb 3;7:1482712. doi: 10.3389/fdgth.2025.1482712. eCollection 2025.
4
Transforming dental diagnostics with artificial intelligence: advanced integration of ChatGPT and large language models for patient care.
Front Dent Med. 2025 Jan 6;5:1456208. doi: 10.3389/fdmed.2024.1456208. eCollection 2024.
5
Accuracy of latest large language models in answering multiple choice questions in dentistry: A comparative study.
PLoS One. 2025 Jan 29;20(1):e0317423. doi: 10.1371/journal.pone.0317423. eCollection 2025.
6
Performance of GPT-3.5 and GPT-4 on the Korean Pharmacist Licensing Examination: Comparison Study.
JMIR Med Educ. 2024 Dec 4;10:e57451. doi: 10.2196/57451.
7
Large Language Models in Dental Licensing Examinations: Systematic Review and Meta-Analysis.
Int Dent J. 2025 Feb;75(1):213-222. doi: 10.1016/j.identj.2024.10.014. Epub 2024 Nov 12.
8
Performance of ChatGPT-3.5 and GPT-4 in national licensing examinations for medicine, pharmacy, dentistry, and nursing: a systematic review and meta-analysis.
BMC Med Educ. 2024 Sep 16;24(1):1013. doi: 10.1186/s12909-024-05944-8.
9
A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence-Based Models in Health Care Education and Practice: Development Study Involving a Literature Review.
Interact J Med Res. 2024 Feb 15;13:e54704. doi: 10.2196/54704.
10
Comparing Artificial Intelligence and Senior Residents in Oral Lesion Diagnosis: A Comparative Study.
Cureus. 2024 Jan 3;16(1):e51584. doi: 10.7759/cureus.51584. eCollection 2024 Jan.