Evaluation of the accuracy of ChatGPT-4 and Gemini's responses to the World Dental Federation's frequently asked questions on oral health.

Authors

Arpaci Aysenur, Ozturk Asel Usdat, Okur Ismail, Sadry Sanaz

Affiliations

Faculty of Dentistry, Department of Periodontology, Istanbul Aydin University, Istanbul, Turkey.

Faculty of Dentistry, Department of Maxillofacial Radiology, Istanbul Atlas University, Istanbul, Turkey.

Publication information

BMC Oral Health. 2025 Aug 2;25(1):1293. doi: 10.1186/s12903-025-06624-9.

DOI:10.1186/s12903-025-06624-9

PMID:40753419

Abstract

BACKGROUND

The field of artificial intelligence (AI) has grown considerably in recent years, with new technologies transforming a range of industries, including healthcare and dentistry. Large language models (LLMs) and natural language processing (NLP) are pivotal to this transformation. This study aimed to assess the efficacy of AI-supported chatbots in answering the oral health questions that patients frequently ask their doctors.

METHODS

The frequently asked questions in the oral health section of the World Dental Federation (FDI) website were posed to the Google-Gemini Trends and ChatGPT-4 chatbots on July 9, 2024. The responses from ChatGPT-4 and Gemini, as well as those on the FDI webpage, were recorded. The researchers analyzed and reported the accuracy of the responses given by ChatGPT-4 and Gemini to the four specified questions, the similarities and differences among the responses, and a comprehensive examination of the capabilities of ChatGPT-4 and Gemini. Furthermore, the similarity of the texts was evaluated against the following criteria: "Main Idea," "Quality Analysis," "Common Ideas," and "Inconsistent Ideas."

RESULTS

Both ChatGPT-4 and Gemini performed comparably to the FDI responses in terms of completeness and clarity. In terms of relevance, ChatGPT-4's responses were more similar to the FDI responses than Gemini's were. ChatGPT-4 also outperformed Gemini on the "Accuracy" criterion.

CONCLUSIONS

This study demonstrated that, according to the assessment conducted by FDI, the ChatGPT-4 and Gemini applications contain contemporary and comprehensible information in response to general inquiries concerning oral health. These applications are regarded as a prevalent and dependable source of information for individuals seeking to access such data.
