
Evaluation of information provided by artificial intelligence chatbots on extraoral maxillofacial prostheses.

Author Information

Özyemişci Nuran, Bal Bilge Turhan, Güngör Merve Bankoğlu, Öztürk Esra Kaynak, Canvar Ayşegül, Nemli Secil Karakoca

Affiliations

Associate Professor, Dental Prosthesis Technology, Vocational School of Health Services, Hacettepe University, Ankara, Turkey.

Professor, Department of Prosthodontics, Faculty of Dentistry, Gazi University, Ankara, Turkey.

Publication Information

J Prosthet Dent. 2025 Sep 8. doi: 10.1016/j.prosdent.2025.08.028.

DOI: 10.1016/j.prosdent.2025.08.028
PMID: 40925817
Abstract

STATEMENT OF PROBLEM

Despite advances in artificial intelligence (AI), the quality, reliability, and understandability of health-related information provided by chatbots remain in question. Furthermore, studies on maxillofacial prosthesis (MP) information generated by AI chatbots are lacking.

PURPOSE

The purpose of this study was to assess and compare the reliability, quality, readability, and similarity of responses to MP-related questions generated by 4 different chatbots.

MATERIAL AND METHODS

A total of 15 questions were prepared by a maxillofacial prosthodontist and posed to 4 different chatbots (ChatGPT-3.5, Gemini 2.5 Flash, Copilot, and DeepSeek V3). The Reliability Scoring (adapted DISCERN), the Global Quality Scale (GQS), the Flesch Reading Ease Score (FRES), the Flesch-Kincaid Reading Grade Level (FKRGL), and the Similarity Index (iThenticate) were used to evaluate the performance of the chatbots. Data were compared using the Kruskal-Wallis test, and differences between chatbots were determined by the Conover multiple comparison test with Benjamini-Hochberg correction (α=.05).
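The readability metrics and the omnibus test named above follow standard published formulas. The sketch below is illustrative only (not the study's code); the function names and example counts are hypothetical, and the Kruskal-Wallis implementation uses average ranks for ties without the tie-correction factor.

```python
# Standard Flesch readability formulas (published coefficients).
def flesch_reading_ease(words, sentences, syllables):
    # FRES = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    # Higher scores mean easier text; low scores indicate professional-level prose.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    # FKRGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    # Approximates the US school grade level needed to read the text.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Kruskal-Wallis H statistic: rank-based omnibus test across k groups.
def kruskal_h(*groups):
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign the average rank to each run of tied values
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sums = [0.0] * len(groups)
    for (_, gi), r in zip(pooled, ranks):
        rank_sums[gi] += r
    # H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
    return (12 / (n * (n + 1))
            * sum(s * s / len(g) for s, g in zip(rank_sums, groups))
            - 3 * (n + 1))
```

Under the null hypothesis, H is compared against a chi-squared distribution with k−1 degrees of freedom; a significant omnibus result is then followed by pairwise comparisons such as the Conover test used in this study.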

RESULTS

There were no significant differences between the chatbots' DISCERN scores, except for one question where ChatGPT showed significantly higher reliability than Gemini or Copilot (P=.03). There was no statistically significant difference among AI tools in terms of GQS values (P=.096), FRES values (P=.166), and FKRGL values (P=.247). The similarity rate of Gemini was statistically higher than other AI chatbots (P=.03).
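The Benjamini-Hochberg correction behind the adjusted P values reported above is a simple step-up procedure; a minimal sketch follows, with made-up P values for illustration (the function name is hypothetical, not from the study).

```python
# Benjamini-Hochberg step-up procedure: controls the false discovery rate
# across m hypotheses at level alpha.
def benjamini_hochberg(pvals, alpha=0.05):
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending P
    # Find the largest rank k whose P value falls under its threshold k/m*alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject
```

For example, with P values [0.01, 0.04, 0.03, 0.5] at α=.05, only the first hypothesis is rejected (0.01 ≤ 1/4 × 0.05, while 0.03 > 2/4 × 0.05).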

CONCLUSIONS

ChatGPT-3.5, Gemini 2.5 Flash, Copilot, and DeepSeek V3 showed good quality responses. All chatbots' responses were difficult for non-professionals to read and understand. Low similarity rates were found for all chatbots except Gemini, indicating originality of their information.

Similar Articles

1. Evaluation of Information Provided by ChatGPT Versions on Traumatic Dental Injuries for Dental Students and Professionals.
   Dent Traumatol. 2025 Aug;41(4):427-436. doi: 10.1111/edt.13042. Epub 2025 Jan 23.
2. Readability, reliability and quality of responses generated by ChatGPT, gemini, and perplexity for the most frequently asked questions about pain.
   Medicine (Baltimore). 2025 Mar 14;104(11):e41780. doi: 10.1097/MD.0000000000041780.
3. Accuracy and Reliability of Artificial Intelligence Chatbots as Public Information Sources in Implant Dentistry.
   Int J Oral Maxillofac Implants. 2025 Jun 25;0(0):1-23. doi: 10.11607/jomi.11280.
4. Evaluating artificial intelligence chatbots' responses to gynecomastia inquiries: Comparative study of information quality, readability, and guideline consistency.
   Digit Health. 2025 Aug 26;11:20552076251367645. doi: 10.1177/20552076251367645. eCollection 2025 Jan-Dec.
5. Evaluating AI chatbots in penis enhancement information: a comparative analysis of readability, reliability and quality.
   Int J Impot Res. 2025 Jun 3. doi: 10.1038/s41443-025-01098-3.
6. Benchmarking AI Chatbots for Maternal Lactation Support: A Cross-Platform Evaluation of Quality, Readability, and Clinical Accuracy.
   Healthcare (Basel). 2025 Jul 20;13(14):1756. doi: 10.3390/healthcare13141756.
7. Artificial Intelligence in Peripheral Artery Disease Education: A Battle Between ChatGPT and Google Gemini.
   Cureus. 2025 Jun 1;17(6):e85174. doi: 10.7759/cureus.85174. eCollection 2025 Jun.
8. Accuracy of ChatGPT-3.5, ChatGPT-4o, Copilot, Gemini, Claude, and Perplexity in advising on lumbosacral radicular pain against clinical practice guidelines: cross-sectional study.
   Front Digit Health. 2025 Jun 27;7:1574287. doi: 10.3389/fdgth.2025.1574287. eCollection 2025.
9. Evaluating the readability, quality, and reliability of responses generated by ChatGPT, Gemini, and Perplexity on the most commonly asked questions about Ankylosing spondylitis.
   PLoS One. 2025 Jun 18;20(6):e0326351. doi: 10.1371/journal.pone.0326351. eCollection 2025.