Performance of AI-Chatbots to Common Temporomandibular Joint Disorders (TMDs) Patient Queries: Accuracy, Completeness, Reliability and Readability.

Authors

Hassan Mohamed G, Abdelaziz Ahmed A, Abdelrahman Hams H, Mohamed Mostafa M Y, Ellabban Mohamed T

Affiliations

Division of Bone and Mineral Diseases, Department of Medicine, School of Medicine, Washington University in St. Louis, St. Louis, Missouri, USA.

Department of Orthodontics, Faculty of Dentistry, Assiut University, Assiut, Egypt.

Publication information

Orthod Craniofac Res. 2025 May 7. doi: 10.1111/ocr.12939.

DOI: 10.1111/ocr.12939
PMID: 40332142
Abstract

Temporomandibular joint disorders (TMDs) are a common group of conditions affecting the temporomandibular joint (TMJ), often resulting from factors such as injury, stress or teeth grinding. This study aimed to evaluate the accuracy, completeness, reliability and readability of the responses generated by ChatGPT-3.5, ChatGPT-4o and Google Gemini to TMD-related inquiries.

Forty-five questions covering various aspects of TMDs were created by two experts and submitted by one author to ChatGPT-3.5, ChatGPT-4o and Google Gemini on the same day. The responses were evaluated for accuracy, completeness and reliability using modified Likert scales. Readability was analysed with six validated indices via a specialised tool. Additional features, such as the inclusion of graphical elements, references and safeguard mechanisms, were also documented and analysed. The Pearson Chi-Square and One-Way ANOVA tests were used for data analysis.

Google Gemini achieved the highest accuracy, providing 100% correct responses, followed by ChatGPT-3.5 (95.6%) and ChatGPT-4o (93.3%). ChatGPT-4o provided the most complete responses (91.1%), followed by ChatGPT-3.5 (64.4%) and Google Gemini (42.2%). The majority of responses were reliable, with 93.3% of ChatGPT-4o responses rated 'Absolutely Reliable', compared to 46.7% for ChatGPT-3.5 and 48.9% for Google Gemini. Both ChatGPT-4o and Google Gemini included references in some responses (22.2% and 13.3%, respectively), while ChatGPT-3.5 included none. Google Gemini was the only model that included multimedia (6.7%). Readability scores were highest for ChatGPT-3.5, suggesting its responses were more complex than those of Google Gemini and ChatGPT-4o.

Both ChatGPT-4o and Google Gemini demonstrated accuracy and reliability in addressing TMD-related questions, with their responses being clear, easy to understand and complemented by safeguard statements encouraging specialist consultation. However, both platforms lacked evidence-based references. Only Google Gemini incorporated multimedia elements into its answers.
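The percentages in the abstract can be mapped back to response counts, since each chatbot answered the same 45 questions. The sketch below is only an illustration of that arithmetic and of the Pearson chi-square statistic the abstract names. The function names (`to_count`, `pearson_chi_square`) and the two-column "top rating vs. everything else" table are assumptions made here; the authors may have tested the full Likert distributions, so the statistic need not match the published analysis.

```python
N_QUESTIONS = 45  # questions submitted to each chatbot (from the abstract)

# Share of responses in the top rating category, as reported in the abstract.
reported = {
    "accuracy": {"ChatGPT-3.5": 95.6, "ChatGPT-4o": 93.3, "Google Gemini": 100.0},
    "completeness": {"ChatGPT-3.5": 64.4, "ChatGPT-4o": 91.1, "Google Gemini": 42.2},
    "absolutely_reliable": {"ChatGPT-3.5": 46.7, "ChatGPT-4o": 93.3, "Google Gemini": 48.9},
}


def to_count(percent, n=N_QUESTIONS):
    """Recover the integer count implied by a percentage rounded to one decimal."""
    count = round(percent * n / 100)
    # The recovered count must round back to the reported percentage.
    if abs(100 * count / n - percent) >= 0.05:
        raise ValueError(f"{percent}% is not a whole number of {n} responses")
    return count


def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Counts per measure and model, e.g. counts["completeness"]["ChatGPT-4o"].
counts = {
    measure: {model: to_count(pct) for model, pct in by_model.items()}
    for measure, by_model in reported.items()
}

# Assumed dichotomised table for completeness: [top rating, any other rating].
completeness_table = [
    [k, N_QUESTIONS - k] for k in counts["completeness"].values()
]
completeness_stat = pearson_chi_square(completeness_table)  # 2 degrees of freedom
```

Every reported percentage corresponds to a whole number of responses out of 45, which is a quick consistency check on the figures as transcribed.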

Similar articles

1. Performance of AI-Chatbots to Common Temporomandibular Joint Disorders (TMDs) Patient Queries: Accuracy, Completeness, Reliability and Readability. Orthod Craniofac Res. 2025 May 7. doi: 10.1111/ocr.12939.
2. A Comparative Analysis of Artificial Intelligence Platforms: ChatGPT-4o and Google Gemini in Answering Questions About Birth Control Methods. Cureus. 2025 Jan 1;17(1):e76745. doi: 10.7759/cureus.76745. eCollection 2025 Jan.
3. The use of ChatGPT and Google Gemini in responding to orthognathic surgery-related questions: A comparative study. J World Fed Orthod. 2025 Feb;14(1):20-26. doi: 10.1016/j.ejwf.2024.09.004. Epub 2024 Oct 28.
4. Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o. Clin Rheumatol. 2024 Nov;43(11):3507-3513. doi: 10.1007/s10067-024-07154-5. Epub 2024 Sep 28.
5. Performance of the ChatGPT-3.5, ChatGPT-4, and Google Gemini large language models in responding to dental implantology inquiries. J Prosthet Dent. 2025 Jan 4. doi: 10.1016/j.prosdent.2024.12.016.
6. Evaluating ChatGPT and Google Gemini Performance and Implications in Turkish Dental Education. Cureus. 2025 Jan 11;17(1):e77292. doi: 10.7759/cureus.77292. eCollection 2025 Jan.
7. Evaluating the Accuracy, Reliability, Consistency, and Readability of Different Large Language Models in Restorative Dentistry. J Esthet Restor Dent. 2025 Jul;37(7):1740-1752. doi: 10.1111/jerd.13447. Epub 2025 Mar 2.
8. Dr. Chatbot: Investigating the Quality and Quantity of Responses Generated by Three AI Chatbots to Prompts Regarding Carpal Tunnel Syndrome. Cureus. 2025 Mar 24;17(3):e81068. doi: 10.7759/cureus.81068. eCollection 2025 Mar.
9. Evaluating the Efficacy of Artificial Intelligence-Driven Chatbots in Addressing Queries on Vernal Conjunctivitis. Cureus. 2025 Feb 26;17(2):e79688. doi: 10.7759/cureus.79688. eCollection 2025 Feb.
10. Comparative analysis of ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced in the treatment of postmenopausal osteoporosis. BMC Musculoskelet Disord. 2025 Apr 16;26(1):369. doi: 10.1186/s12891-025-08601-3.

Cited by

1. Diagnostic Performance of ChatGPT-4o in Analyzing Oral Mucosal Lesions: A Comparative Study with Experts. Medicina (Kaunas). 2025 Jul 30;61(8):1379. doi: 10.3390/medicina61081379.
2. Reliability of Large Language Model-Based Chatbots Versus Clinicians as Sources of Information on Orthodontics: A Comparative Analysis. Dent J (Basel). 2025 Jul 24;13(8):343. doi: 10.3390/dj13080343.