Xu Mingyu, Wang Lisha, Wang Shengzhan, Zhou Yifan, Maimaiti Nuliqiman, Shi Xin, Gu Renshu, Jia Gangyong, Jiao Zicheng, Gao Hongyi, Xu Peifang, Ye Juan
Eye Center of Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China.
Zhejiang Provincial Key Laboratory of Ophthalmology, Zhejiang Provincial Clinical Research Center for Eye Diseases, Zhejiang Provincial Engineering Institute on Eye Diseases, Hangzhou, Zhejiang, China.
Ophthalmol Sci. 2025 Jul 12;5(6):100883. doi: 10.1016/j.xops.2025.100883. eCollection 2025 Nov-Dec.
To develop and validate a multi-label, multi-disease, well-generalized, and interpretable screening system for detecting common ocular anterior segment diseases from ocular surface slit-lamp images.
A multicenter artificial intelligence diagnostic study.
A total of 1990 patients were randomly selected from 2 medical centers: the Second Affiliated Hospital of Zhejiang University and the Affiliated People's Hospital of Ningbo University, between November 2016 and March 2022.
The data set was retrospectively collected from 2 clinical centers and comprised 5132 anonymized slit-lamp images of 13 ocular anterior segment diseases. The screening system was trained and validated on an internal data set of randomly selected phenotypes and was tested on both internal and external data sets that included less-trained or new phenotypes. The model's performance was then compared with that of ophthalmologists.
The diagnostic accuracy, precision, recall, sensitivity, specificity, F1 score, Matthews correlation coefficient, confusion matrix, and area under the receiver operating characteristic curve.
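As an illustration of how several of these measures relate to one another, the following minimal sketch derives them from the four cells of a binary confusion matrix. It uses the standard textbook definitions and is not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common diagnostic metrics from binary confusion-matrix counts.

    Hypothetical helper for illustration only; ratios with a zero
    denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall and sensitivity are the same quantity: TP / (TP + FN).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient uses all four cells of the matrix.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, not taken from the study.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

For a multi-class or multi-label task, one common choice is to compute these per class (one-vs-rest) and then average; whether the study averaged this way is not stated in the abstract.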
The multi-label, multi-disease detection ability of the screening system was evaluated at 3 stepwise levels. Average accuracy reached 0.969 and 0.923 for binary image-level anomaly detection, 0.940 and 0.827 for 4-class region-level anomaly detection, and 0.972 and 0.911 for 13-class lesion-level anomaly detection in the internal and external test data sets, respectively, which was comparable to the performance of the ophthalmologists. Furthermore, on images of less-trained or untrained phenotypes, the screening system achieved average accuracies of 0.950 and 0.852 in the internal and external test data sets, respectively.
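One way to picture how the 3 levels relate is to roll a 13-element lesion-level label vector up into 4 region-level labels and a single image-level abnormal flag. The sketch below is an assumption about that relationship, not the authors' architecture: the abstract does not give the lesion-to-region assignment, so `REGION_OF_LESION` is a placeholder mapping, and treating the region level as multi-label is likewise assumed.

```python
from typing import List, Tuple

NUM_LESIONS = 13  # lesion-level classes reported in the abstract
NUM_REGIONS = 4   # region-level classes reported in the abstract

# Placeholder mapping from lesion index to region index. The real
# assignment is not given in the abstract; this one is purely illustrative.
REGION_OF_LESION: List[int] = [i % NUM_REGIONS for i in range(NUM_LESIONS)]


def roll_up(lesion_labels: List[int]) -> Tuple[int, List[int]]:
    """Derive image-level and region-level labels from lesion-level labels.

    lesion_labels: 13 binary values, 1 where that lesion type is present.
    Returns (image_abnormal, region_labels), where image_abnormal is 0 or 1
    and region_labels holds 4 binary values.
    """
    if len(lesion_labels) != NUM_LESIONS:
        raise ValueError(f"expected {NUM_LESIONS} lesion labels")
    region_labels = [0] * NUM_REGIONS
    for lesion_idx, present in enumerate(lesion_labels):
        if present:
            # A region is abnormal if any lesion mapped to it is present.
            region_labels[REGION_OF_LESION[lesion_idx]] = 1
    # An image is abnormal if any lesion at all is present.
    image_abnormal = int(any(lesion_labels))
    return image_abnormal, region_labels


if __name__ == "__main__":
    labels = [0] * NUM_LESIONS
    labels[0] = 1
    labels[5] = 1
    print(roll_up(labels))
```

Under this reading, a multi-label image can carry several lesion labels at once, and the coarser levels follow deterministically from the finest one.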
Our screening system showed excellent multi-label, multi-disease detection and generalization ability in identifying ocular anterior segment diseases, despite the limited phenotypes in the training data set. The screening system is therefore anticipated to offer readily available primary medical information to patients and to assist ophthalmologists in clinical practice.
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.