Decoding wisdom: Evaluating ChatGPT's accuracy and reproducibility in analyzing orthopantomographic images for third molar assessment.

Authors

Suárez Ana, Arena Stefania, Herranz Calzada Alberto, Castillo Varón Ana Isabel, Diaz-Flores García Victor, Freire Yolanda

Affiliations

Department of Pre-Clinic Dentistry II, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, Madrid 28670, Spain.

Department of Pre-Clinic Dentistry I, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, Madrid 28670, Spain.

Publication information

Comput Struct Biotechnol J. 2025 Apr 11;28:141-147. doi: 10.1016/j.csbj.2025.04.010. eCollection 2025.


DOI: 10.1016/j.csbj.2025.04.010
PMID: 40271108
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12017887/
Abstract

The integration of Artificial Intelligence (AI) into healthcare has opened new avenues for clinical decision support, particularly in radiology. The aim of this study was to evaluate the accuracy and reproducibility of ChatGPT-4o in the radiographic image interpretation of orthopantomograms (OPGs) for assessment of lower third molars, simulating real patient requests for tooth extraction. Thirty OPGs were analyzed, each paired with a standardized prompt submitted to ChatGPT-4o, generating 900 responses (30 per radiograph). Two oral surgery experts independently evaluated the responses using a three-point Likert scale (correct, partially correct/incomplete, incorrect), with disagreements resolved by a third expert. ChatGPT-4o achieved an accuracy rate of 38.44% (95% CI: 35.27%-41.62%). The percentage agreement among repeated responses was 82.7%, indicating high consistency, though Gwet's coefficient of agreement (60.4%) suggested only moderate repeatability. While the model correctly identified general features in some cases, it frequently provided incomplete or fabricated information, particularly in complex radiographs involving overlapping structures or underdeveloped roots. These findings highlight ChatGPT-4o's current limitations in dental radiographic interpretation. Although it demonstrated some capability in analyzing OPGs, its accuracy and reliability remain insufficient for unsupervised clinical use. Professional oversight is essential to prevent diagnostic errors. Further refinement and specialized training of AI models are needed to enhance their performance and ensure safe integration into dental practice, especially in patient-facing applications.
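
As a rough check on the reported accuracy figures, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The abstract does not state the raw count of correct responses or which interval method the authors used; the count of 346 correct out of 900 is an assumption inferred from the reported 38.44%, and the function name `wald_ci` is hypothetical. The Wald formula also treats all 900 responses as independent, which is a simplification given that they are grouped 30 per radiograph.

```python
import math


def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) for a Wald interval.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# 30 radiographs x 30 responses each, as stated in the abstract.
n_responses = 30 * 30

# ASSUMPTION: the count is not given in the abstract; it is inferred
# by rounding 38.44% of 900 to the nearest whole response.
correct = round(0.3844 * n_responses)

p, lower, upper = wald_ci(correct, n_responses)
print(f"accuracy = {p:.2%}, 95% CI = {lower:.2%} to {upper:.2%}")
```

Under these assumptions the result can be compared directly with the interval quoted in the abstract; a different interval method (for example Wilson) or a clustered analysis would give slightly different bounds.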


