Aghajani Amirhossein, Rajabi Mohammad Taher, Rafizadeh Seyed Mohsen, Zand Amin, Rezaei Majid, Shojaeinia Mohammad, Rahmanikhah Elham
Department of Oculo-Facial Plastic and Reconstructive Surgery, Farabi Eye Hospital, Tehran University of Medical Sciences, Qazvin Square, Tehran, Iran.
Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran.
BMC Ophthalmol. 2025 Apr 1;25(1):162. doi: 10.1186/s12886-025-03988-y.
To compare two artificial intelligence (AI) models, residual neural networks ResNet-50 and ResNet-101, for screening thyroid eye disease (TED) using frontal face photographs, and to test these models under clinical conditions.
A total of 1601 face photographs were obtained and preprocessed by cropping to a region centered on the eyes. Photographs from 643 TED patients and 643 healthy individuals were used to train the ResNet models; 81 photographs of TED patients and 74 of healthy subjects formed the validation dataset; and 80 TED cases and 80 healthy subjects comprised the test dataset. For application tests under clinical conditions, data from 25 TED patients and 25 healthy individuals were used to evaluate the non-inferiority of the AI models, with general ophthalmologists and fellows as the control group.
On the test set, the ResNet-50 model achieved an area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity of 0.94, 0.88, 0.64, and 0.92, respectively; for the ResNet-101 model, these metrics were 0.93, 0.84, 0.76, and 0.92. In the application tests under clinical conditions, which evaluated non-inferiority, the ResNet-50 model achieved an AUC, accuracy, sensitivity, and specificity of 0.82, 0.82, 0.88, and 0.76, respectively; for the ResNet-101 model, these metrics were 0.91, 0.84, 0.92, and 0.76, with no statistically significant differences between the two models for any metric (all p-values > 0.05).
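The relationship between these metrics can be sketched with a little arithmetic. The snippet below is illustrative only and is not taken from the study: the helper name `accuracy_from_rates` is hypothetical, and it assumes that accuracy is defined as (TP + TN) / N and was measured at the same decision threshold as sensitivity and specificity. Under that assumption, accuracy on a balanced dataset is simply the mean of sensitivity and specificity.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity and specificity.

    Hypothetical helper for illustration; assumes accuracy = (TP + TN) / N
    at the same threshold used for sensitivity and specificity.
    """
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)


# (sensitivity, specificity, TED cases, healthy subjects) as reported above
reported = {
    "ResNet-50, test set": (0.64, 0.92, 80, 80),
    "ResNet-101, test set": (0.76, 0.92, 80, 80),
    "ResNet-50, clinical": (0.88, 0.76, 25, 25),
    "ResNet-101, clinical": (0.92, 0.76, 25, 25),
}

for name, (sens, spec, n_pos, n_neg) in reported.items():
    implied = accuracy_from_rates(sens, spec, n_pos, n_neg)
    print(f"{name}: implied accuracy = {implied:.2f}")
```

Under these assumptions, the implied accuracies for ResNet-101 on the test set (0.84) and for both models in the clinical application test (0.82 and 0.84) match the reported values. For ResNet-50 on the test set, the same formula gives about 0.78 rather than the reported 0.88, so those figures may be worth checking against the full text.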
Face image-based TED screening with the ResNet-50 and ResNet-101 AI models shows acceptable accuracy, sensitivity, and specificity for distinguishing TED patients from healthy subjects.