Suárez Ana, Arena Stefania, Herranz Calzada Alberto, Castillo Varón Ana Isabel, Diaz-Flores García Victor, Freire Yolanda
Department of Pre-Clinic Dentistry II, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, Madrid 28670, Spain.
Department of Pre-Clinic Dentistry I, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, Madrid 28670, Spain.
Comput Struct Biotechnol J. 2025 Apr 11;28:141-147. doi: 10.1016/j.csbj.2025.04.010. eCollection 2025.
The integration of artificial intelligence (AI) into healthcare has opened new avenues for clinical decision support, particularly in radiology. This study evaluated the accuracy and reproducibility of ChatGPT-4o in interpreting orthopantomograms (OPGs) to assess lower third molars, in a scenario simulating real patient requests for tooth extraction. Thirty OPGs were analyzed, each paired with a standardized prompt submitted to ChatGPT-4o 30 times, generating 900 responses in total. Two oral surgery experts independently rated the responses on a three-point Likert scale (correct, partially correct/incomplete, incorrect), with disagreements resolved by a third expert. ChatGPT-4o achieved an accuracy rate of 38.44% (95% CI: 35.27% to 41.62%). The percentage agreement among repeated responses was 82.7%, indicating high consistency, although Gwet's coefficient of agreement (60.4%) suggested only moderate repeatability. While the model correctly identified general features in some cases, it frequently provided incomplete or fabricated information, particularly in complex radiographs involving overlapping structures or underdeveloped roots. These findings highlight ChatGPT-4o's current limitations in dental radiographic interpretation. Although it demonstrated some capability in analyzing OPGs, its accuracy and reliability remain insufficient for unsupervised clinical use, and professional oversight is essential to prevent diagnostic errors. Further refinement and specialized training of AI models are needed to improve their performance and ensure safe integration into dental practice, especially in patient-facing applications.
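As a rough check on the reported accuracy and its interval, the sketch below recomputes a 95% confidence interval for a proportion using the normal-approximation (Wald) formula. Two things here are assumptions rather than facts from the abstract: the count of 346 correct responses is inferred from 38.44% of 900, and the Wald method is only a guess at how the interval was derived (it happens to reproduce the published bounds). The function name `wald_ci` is illustrative.

```python
import math


def wald_ci(successes: int, n: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the normal-approximation interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# 30 radiographs x 30 repeated responses each, as stated in the abstract.
n_responses = 30 * 30

# Assumption: 346 correct responses, inferred from 38.44% of 900.
correct = 346

p, lower, upper = wald_ci(correct, n_responses)
print(f"accuracy = {p:.2%}, 95% CI = {lower:.2%} to {upper:.2%}")
```

The Wald interval treats all 900 responses as independent, which may be optimistic given that each radiograph contributes 30 repeated responses; a clustered or bootstrap interval would likely be wider.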