Reliability of Large Language Model-Based Chatbots Versus Clinicians as Sources of Information on Orthodontics: A Comparative Analysis.

Authors

Martina Stefano, Cannatà Davide, Paduano Teresa, Schettino Valentina, Giordano Francesco, Galdi Marzio

Affiliation

Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana", University of Salerno, Via Allende, 84081 Baronissi, Italy.

Publication

Dent J (Basel). 2025 Jul 24;13(8):343. doi: 10.3390/dj13080343.


DOI:10.3390/dj13080343
PMID:40863046
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12385111/
Abstract

Background/Objectives: The present cross-sectional analysis aimed to investigate whether Large Language Model-based chatbots can be used as reliable sources of information in orthodontics by evaluating chatbot responses and comparing them to those of dental practitioners with different levels of knowledge. Methods: Eight frequently asked orthodontic questions in true-or-false format were submitted to five leading chatbots (ChatGPT-4, Claude-3-Opus, Gemini 2.0 Flash Experimental, Microsoft Copilot, and DeepSeek). The consistency of the answers given by the chatbots at four different times was assessed using Cronbach's α. The chi-squared test was used to compare chatbot responses with those given by two groups of clinicians, i.e., general dental practitioners (GDPs) and orthodontic specialists (Os), recruited in an online survey via social media; differences were considered significant when p < 0.05. Additionally, chatbots were asked to justify their dichotomous responses using a chain-of-thought prompting approach, and the educational value of the responses was rated according to the Global Quality Scale (GQS). Results: A high degree of consistency in answering was found for all analyzed chatbots (α > 0.80). When chatbot answers were compared with those of GDPs and Os, statistically significant differences were found for almost all the questions (p < 0.05). In the evaluation of educational value, DeepSeek achieved the highest GQS score (median 4.00; interquartile range 0.00), whereas Copilot had the lowest (median 2.00; interquartile range 2.00). Conclusions: Although chatbots yield somewhat useful information about orthodontics, they can provide misleading information when dealing with controversial topics.
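As an illustration of the consistency measure named in the abstract, the following is a minimal sketch of Cronbach's α for repeated true/false answers. It assumes that each of the four repetitions is treated as an "item" and each question as a "case", with answers coded 1 (true) and 0 (false). The abstract does not state how the authors coded or computed α, and the data below are hypothetical, not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of coded answers.

    scores: list of rows, one row per question; each row holds the k
    repeated answers to that question, coded 0 or 1 (assumed coding).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two repetitions")
    # Sample variance of each repetition (column) across questions.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of the per-question sum across repetitions.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("alpha is undefined when the totals do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical chatbot: 8 questions, 4 repetitions, identical answers each time.
stable = [[a] * 4 for a in (1, 0, 1, 0, 1, 1, 0, 0)]

# Same hypothetical data with one answer flipped in the fourth repetition.
one_flip = [row[:] for row in stable]
one_flip[0][3] = 0

alpha_stable = cronbach_alpha(stable)
alpha_flip = cronbach_alpha(one_flip)
print(round(alpha_stable, 3), round(alpha_flip, 3))
```

Under this coding, perfectly repeated answers give α = 1, and a single changed answer lowers α while keeping it above the 0.80 level reported in the abstract.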

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eff2/12385111/31c083170b69/dentistry-13-00343-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eff2/12385111/b528a089457d/dentistry-13-00343-g001.jpg
