Dental Age Estimation from Panoramic Radiographs: A Comparison of Orthodontist and ChatGPT-4 Evaluations Using the London Atlas, Nolla, and Haavikko Methods.

Authors

Dursun Derya, Bilici Geçer Rumeysa

Affiliation

Department of Orthodontics, Hamidiye Faculty of Dental Medicine, University of Health Sciences, Istanbul 34668, Turkey.

Publication

Diagnostics (Basel). 2025 Sep 19;15(18):2389. doi: 10.3390/diagnostics15182389.

Abstract

Dental age (DA) estimation, which is widely used in orthodontics, pediatric dentistry, and forensic dentistry, predicts chronological age (CA) by assessing tooth development and maturation. Most methods rely on radiographic evaluation of tooth mineralization and eruption stages to assess DA. With the increasing adoption of large language models (LLMs) in medical sciences, use of ChatGPT has extended to processing visual data. The aim of this study, therefore, was to evaluate the performance of ChatGPT-4 in estimating DA from panoramic radiographs using three conventional methods (Nolla, Haavikko, and London Atlas) and to compare its accuracy against both orthodontist assessments and CA. In this retrospective study, panoramic radiographs of 511 Turkish children aged 6-17 years were assessed. DA was estimated using the Nolla, Haavikko, and London Atlas methods by both orthodontists and ChatGPT-4. The DA-CA difference and mean absolute error (MAE) were calculated, and statistical comparisons were performed to assess accuracy, sex differences, and agreement between the evaluators, with significance set at p < 0.05. The mean CA of the study population was 12.37 ± 2.95 years (boys: 12.39 ± 2.94; girls: 12.35 ± 2.96). Using the London Atlas method, the orthodontists overestimated CA with a DA-CA difference of 0.78 ± 1.26 years (p < 0.001), whereas ChatGPT-4 showed no significant DA-CA difference (0.03 ± 0.93; p = 0.399). Using the Nolla method, the orthodontist showed no significant DA-CA difference (0.03 ± 1.14; p = 0.606), but ChatGPT-4 underestimated CA with a DA-CA difference of -0.40 ± 1.96 years (p < 0.001). Using the Haavikko method, both evaluators underestimated CA (orthodontist: -0.88; ChatGPT-4: -1.18; p < 0.001). The lowest MAE for ChatGPT-4 was obtained when using the London Atlas method (0.59 ± 0.72), followed by Nolla (1.33 ± 1.28) and Haavikko (1.51 ± 1.41). For the orthodontists, the lowest MAE was achieved when using the Nolla method (0.86 ± 0.75). Agreement between the orthodontists and ChatGPT-4 was highest when using the London Atlas method (ICC = 0.944, r = 0.905). ChatGPT-4 showed the highest accuracy with the London Atlas method, with no significant difference from CA for either sex and the lowest prediction error. When using the Nolla and Haavikko methods, both ChatGPT-4 and the orthodontist tended to underestimate age, with higher errors. Overall, ChatGPT-4 performed best when using visually guided methods and was less accurate when using multi-stage scoring methods.
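
The abstract reports two error measures per method: the signed DA-CA difference (mean ± SD) and the MAE. The following is a minimal sketch of how such summaries could be computed, not the authors' analysis code. The function name `error_summary` and the age values are hypothetical and chosen only for illustration; they are not study data.

```python
from statistics import mean, stdev


def error_summary(dental_ages, chronological_ages):
    """Summarize estimation error for paired DA and CA values (in years).

    Returns the mean signed DA-CA difference, its sample standard
    deviation, and the mean absolute error (MAE).
    """
    if len(dental_ages) != len(chronological_ages):
        raise ValueError("DA and CA lists must have the same length")
    # Signed difference: positive means overestimation, negative underestimation.
    diffs = [da - ca for da, ca in zip(dental_ages, chronological_ages)]
    return {
        "mean_diff": mean(diffs),
        "sd_diff": stdev(diffs),
        "mae": mean(abs(d) for d in diffs),
    }


# Hypothetical ages for illustration only (not taken from the study).
ca = [8.0, 10.5, 12.0, 14.5]
da = [8.5, 10.0, 12.5, 14.0]
summary = error_summary(da, ca)
print(summary)
```

In this toy example the over- and underestimates cancel, so the mean signed difference is zero while the MAE is 0.5 years. This is why a small mean DA-CA difference, such as the 0.03 reported for ChatGPT-4 with the London Atlas method, is read alongside the MAE (0.59) and not on its own.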
