Performance of AI Chatbots in Preliminary Diagnosis of Maxillofacial Pathologies.

Authors

Guler Ridvan, Yalcin Emine

Affiliation

Department of Oral and Maxillofacial Surgery, Dicle University Faculty of Dentistry, Diyarbakir, Turkey.

Publication

Med Sci Monit. 2025 Jul 9;31:e949076. doi: 10.12659/MSM.949076.

DOI: 10.12659/MSM.949076
PMID: 40629684
Abstract

BACKGROUND: Artificial intelligence (AI) has shown significant potential in transforming healthcare by enabling accurate, data-driven decision-making. This study compared the performance of the AI chatbots ChatGPT, Grok, Blackbox, and Claude AI in preliminary diagnosis of maxillofacial pathologies.

MATERIAL AND METHODS: This study included 23 patients (9 cysts, 14 neoplasms) who underwent operations at Dicle University Faculty of Dentistry between 2017 and 2024 and had their diagnoses histopathologically confirmed. For each case, 4 differential diagnosis options were prepared in question format and directed to the AI platforms. The accuracy of the chatbots' answers was assessed by comparing them with the definitive histopathological diagnoses of the cases. Statistical analysis used the chi-square and Fisher-Freeman-Halton tests to compare performance among the chatbots. Statistical significance was set at p<0.05.

RESULTS: ChatGPT answered 15 out of 23 questions correctly, achieving a success rate of 65.2%. Grok and Blackbox AI each achieved a success rate of 52.17%, while Claude AI achieved the lowest success rate, at 30.43%. When cases were categorized into cysts and neoplasms, Blackbox AI showed the highest accuracy for cyst cases (66.6%), while ChatGPT had the highest accuracy for neoplasm cases (71.4%). No statistically significant difference was observed in the distribution of correct and incorrect answers among the chatbots (p=0.125). No statistically significant difference was observed in the distribution of cyst and neoplasm answers among the chatbots (p=0.654).

CONCLUSIONS: Although all 4 AI chatbots achieved certain levels of accuracy, ChatGPT showed superior performance compared to the other chatbots. The development of these chatbots could be beneficial for diagnostic accuracy and treatment recommendations in dentistry.
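The overall comparison above can be reproduced from the reported success rates alone. A minimal sketch, assuming the per-chatbot correct counts implied by the percentages (65.2% → 15/23, 52.17% → 12/23, 30.43% → 7/23), computes the Pearson chi-square statistic for the 4×2 correct/incorrect table; this is an illustrative reconstruction, not the authors' analysis code.

```python
# Reconstructed correct-answer counts out of 23 questions per chatbot,
# derived from the success rates reported in the abstract.
correct = {"ChatGPT": 15, "Grok": 12, "Blackbox": 12, "Claude": 7}
total_questions = 23

# 4x2 contingency table: (correct, incorrect) per chatbot
table = [(c, total_questions - c) for c in correct.values()]

grand_total = len(table) * total_questions
col_totals = [sum(row[j] for row in table) for j in (0, 1)]

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected
chi2 = 0.0
for row in table:
    for j, observed in enumerate(row):
        expected = total_questions * col_totals[j] / grand_total
        chi2 += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (2 - 1)  # (rows - 1) * (cols - 1) = 3
print(f"chi2 = {chi2:.3f}, df = {df}")
# chi2 ≈ 5.739 is below the 7.815 critical value at alpha=0.05 with df=3,
# consistent with the non-significant p = 0.125 reported in the abstract.
```

With `scipy.stats.chi2_contingency` the same table gives p ≈ 0.125, matching the paper's reported value, which suggests the reconstructed counts are correct.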


