Suárez Ana, Freire Yolanda, Suárez María, Díaz-Flores García Víctor, Andreu-Vázquez Cristina, Thuissard Vasallo Israel John, Castillo Varón Ana Isabel, Martín Carmen
Department of Pre-Clinic Dentistry II, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Madrid, Spain.
Department of Pre-Clinic Dentistry I, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Madrid, Spain.
Oral Dis. 2025 Jun 22. doi: 10.1111/odi.70009.
This study evaluated the accuracy and repeatability of ChatGPT-4o, a multimodal AI model, in interpreting photographs of oral mucosal lesions, and explored its potential as a diagnostic support tool for specialists and non-specialists.
Thirty clinical photographs of oral and labial mucosal lesions were analysed using ChatGPT-4o. For each image, 30 responses were generated over 20 days. The model was asked to identify the anatomical location, suggest a diagnosis, and recommend diagnostic tests and treatments. Two oral pathology experts assessed the resulting 3600 responses (30 images × 30 repetitions × 4 questions) using a three-point scale (0 = incorrect, 1 = partially correct, 2 = correct). Accuracy was summarised as accuracy rates, and repeatability was analysed using Gwet's AC and percent agreement.
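As an illustration of the repeatability statistics named above, the following is a minimal sketch, not the authors' analysis code. It assumes the unweighted form of Gwet's coefficient (AC1), treats each repeated response to the same image as a "rater", and defines percent agreement as the mean pairwise agreement per image; the abstract does not state which variant or definition was used. The function names and the toy scores are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings):
    """Mean pairwise agreement among repeated scores, averaged over items.

    `ratings` is a list of lists: one inner list of scores per image.
    Every inner list must hold at least two scores.
    """
    total = 0.0
    for scores in ratings:
        r = len(scores)
        counts = Counter(scores)
        # Share of ordered score pairs for this item that agree.
        total += sum(c * (c - 1) for c in counts.values()) / (r * (r - 1))
    return total / len(ratings)


def gwet_ac1(ratings, categories=(0, 1, 2)):
    """Unweighted Gwet's AC1 (assumed variant) for repeated categorical scores."""
    q = len(categories)
    n = len(ratings)
    p_a = percent_agreement(ratings)
    # Average proportion of scores falling in each category.
    pi = {
        k: sum(scores.count(k) / len(scores) for scores in ratings) / n
        for k in categories
    }
    # Chance agreement as defined for AC1.
    p_e = sum(p * (1 - p) for p in pi.values()) / (q - 1)
    return (p_a - p_e) / (1 - p_e)


# Toy data: three images, three repeated scores each (0, 1 or 2).
toy = [[2, 2, 2], [2, 2, 1], [0, 0, 0]]
print(round(percent_agreement(toy), 3))
print(round(gwet_ac1(toy), 3))
```

Unlike Cohen's kappa, AC1 bases its chance-agreement term on the average category prevalence, which keeps it stable when most scores fall into a single category, as can happen when a model gives the same answer repeatedly.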
ChatGPT-4o achieved 71.4% accuracy in identifying lesion location and 58.2% in diagnosis. In cases with correct diagnoses, the model reached 90.7% and 95.8% accuracy in suggesting diagnostic tests and treatments, respectively. Repeated responses showed substantial to almost perfect agreement across all evaluated aspects.
ChatGPT-4o showed potential as a reliable and accessible tool to support the initial assessment of oral lesions. Although not a substitute for clinical judgment, it may enhance diagnostic efficiency, particularly in resource-limited settings. Further validation is needed before clinical use.