
Performance of AI-Chatbots to Common Temporomandibular Joint Disorders (TMDs) Patient Queries: Accuracy, Completeness, Reliability and Readability.

Author Information

Hassan Mohamed G, Abdelaziz Ahmed A, Abdelrahman Hams H, Mohamed Mostafa M Y, Ellabban Mohamed T

Affiliations

Division of Bone and Mineral Diseases, Department of Medicine, School of Medicine, Washington University in St. Louis, St. Louis, Missouri, USA.

Department of Orthodontics, Faculty of Dentistry, Assiut University, Assiut, Egypt.

Publication Information

Orthod Craniofac Res. 2025 May 7.

DOI: 10.1111/ocr.12939
PMID: 40332142
Abstract

TMDs are a common group of conditions affecting the temporomandibular joint (TMJ), often resulting from factors like injury, stress or teeth grinding. This study aimed to evaluate the accuracy, completeness, reliability and readability of the responses generated by ChatGPT-3.5, ChatGPT-4o and Google Gemini to TMD-related inquiries. Forty-five questions covering various aspects of TMDs were created by two experts and submitted by one author to ChatGPT-3.5, ChatGPT-4o and Google Gemini on the same day. The responses were evaluated for accuracy, completeness and reliability using modified Likert scales. Readability was analysed with six validated indices via a specialised tool. Additional features, such as the inclusion of graphical elements, references and safeguard mechanisms, were also documented and analysed. The Pearson Chi-Square and One-Way ANOVA tests were used for data analysis. Google Gemini achieved the highest accuracy, providing 100% correct responses, followed by ChatGPT-3.5 (95.6%) and ChatGPT-4o (93.3%). ChatGPT-4o provided the most complete responses (91.1%), followed by ChatGPT-3.5 (64.4%) and Google Gemini (42.2%). The majority of responses were reliable, with ChatGPT-4o at 93.3% 'Absolutely Reliable', compared to 46.7% for ChatGPT-3.5 and 48.9% for Google Gemini. Both ChatGPT-4o and Google Gemini included references in responses, 22.2% and 13.3%, respectively, while ChatGPT-3.5 included none. Google Gemini was the only model that included multimedia (6.7%). Readability scores were highest for ChatGPT-3.5, suggesting its responses were more complex than those of Google Gemini and ChatGPT-4o. Both ChatGPT-4o and Google Gemini demonstrated accuracy and reliability in addressing TMD-related questions, with their responses being clear, easy to understand and complemented by safeguard statements encouraging specialist consultation. However, both platforms lacked evidence-based references. Only Google Gemini incorporated multimedia elements into its answers.
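The abstract notes that readability was scored with six validated indices via a specialised tool, which the abstract does not name. As an illustrative sketch only (not the authors' actual method), one widely used index of this kind, the Flesch-Kincaid Grade Level, can be approximated as follows; the naive vowel-group syllable counter is a simplification introduced here for demonstration:

```python
import re

def count_syllables(word):
    # Naive heuristic: one syllable per run of consecutive vowels (y included).
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    # FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Under this index, a higher score indicates text that requires more years of education to read, which matches the abstract's interpretation that ChatGPT-3.5's higher readability scores reflect more complex responses.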


Similar Articles

[1]
Performance of AI-Chatbots to Common Temporomandibular Joint Disorders (TMDs) Patient Queries: Accuracy, Completeness, Reliability and Readability.

Orthod Craniofac Res. 2025-5-7

[2]
A Comparative Analysis of Artificial Intelligence Platforms: ChatGPT-4o and Google Gemini in Answering Questions About Birth Control Methods.

Cureus. 2025-1-1

[3]
The use of ChatGPT and Google Gemini in responding to orthognathic surgery-related questions: A comparative study.

J World Fed Orthod. 2025-2

[4]
Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o.

Clin Rheumatol. 2024-11

[5]
Performance of the ChatGPT-3.5, ChatGPT-4, and Google Gemini large language models in responding to dental implantology inquiries.

J Prosthet Dent. 2025-1-4

[6]
Evaluating ChatGPT and Google Gemini Performance and Implications in Turkish Dental Education.

Cureus. 2025-1-11

[7]
Evaluating the Accuracy, Reliability, Consistency, and Readability of Different Large Language Models in Restorative Dentistry.

J Esthet Restor Dent. 2025-7

[8]
Dr. Chatbot: Investigating the Quality and Quantity of Responses Generated by Three AI Chatbots to Prompts Regarding Carpal Tunnel Syndrome.

Cureus. 2025-3-24

[9]
Evaluating the Efficacy of Artificial Intelligence-Driven Chatbots in Addressing Queries on Vernal Conjunctivitis.

Cureus. 2025-2-26

[10]
Comparative analysis of ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced in the treatment of postmenopausal osteoporosis.

BMC Musculoskelet Disord. 2025-4-16

Cited By

[1]
Diagnostic Performance of ChatGPT-4o in Analyzing Oral Mucosal Lesions: A Comparative Study with Experts.

Medicina (Kaunas). 2025-7-30

[2]
Reliability of Large Language Model-Based Chatbots Versus Clinicians as Sources of Information on Orthodontics: A Comparative Analysis.

Dent J (Basel). 2025-7-24
