Son Ki Young, Ko Jongwoo, Kim Eunseok, Lee Si Young, Kim Min-Ji, Han Jisang, Shin Eunhae, Chung Tae-Young, Lim Dong Hui
Department of Ophthalmology, Sungkyunkwan University School of Medicine, Samsung Medical Center, Seoul, Republic of Korea.
Graduate School of Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea.
Ophthalmol Sci. 2022 Mar 18;2(2):100147. doi: 10.1016/j.xops.2022.100147. eCollection 2022 Jun.
To develop and validate an automated deep learning (DL)-based artificial intelligence (AI) platform for diagnosing and grading cataracts using slit-lamp and retroillumination lens photographs based on the Lens Opacities Classification System (LOCS) III.
Cross-sectional study in which a convolutional neural network was trained and tested using slit-lamp and retroillumination lens photographs.
One thousand three hundred thirty-five slit-lamp images and 637 retroillumination lens images from 596 patients.
Slit-lamp and retroillumination lens photographs were graded by 2 trained graders using LOCS III. Image datasets were labeled and divided into training, validation, and test datasets. We trained and validated AI platforms with 4 key strategies from the AI domain: (1) a region detection network to handle redundant information in the images, (2) data augmentation and transfer learning to address the small dataset size, (3) generalized cross-entropy loss to address dataset bias, and (4) class-balanced loss to address class imbalance. The performance of the AI platform was reinforced with an ensemble of 3 AI algorithms: ResNet18, WideResNet50-2, and ResNeXt50.
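The two loss strategies and the ensemble step named in the methods can be illustrated with a minimal sketch. The abstract does not give formulas or hyperparameters, so everything below is an assumption: the generalized cross-entropy is written in its commonly used form L_q = (1 - p_y^q) / q, the class-balanced weights use the "effective number of samples" form (1 - beta) / (1 - beta^n_c), and the ensemble simply averages softmax probabilities across models. The function names, `q=0.7`, `beta=0.999`, and the example logits and class counts are hypothetical and are not taken from the study.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generalized_cross_entropy(probs, label, q=0.7):
    """Assumed form L_q = (1 - p_y**q) / q, with 0 < q <= 1.

    Small q behaves like ordinary cross-entropy; q = 1 gives 1 - p_y,
    which penalizes confidently wrong (possibly mislabeled) samples less.
    """
    return (1.0 - probs[label] ** q) / q


def class_balanced_weights(class_counts, beta=0.999):
    """Assumed per-class weights (1 - beta) / (1 - beta**n_c).

    Weights are rescaled to sum to the number of classes, so rare
    grades receive weights above 1 and common grades below 1.
    All counts must be positive.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def ensemble_predict(per_model_logits):
    """Average softmax outputs of several models (assumed combination
    rule) and return the predicted class index with the mean probabilities."""
    per_model_probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    mean = [
        sum(p[c] for p in per_model_probs) / n_models
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=mean.__getitem__)
    return predicted, mean


if __name__ == "__main__":
    # Hypothetical label counts for three grades (not study data).
    weights = class_balanced_weights([500, 50, 5])
    # Hypothetical logits from three models for a single image.
    label, mean_probs = ensemble_predict(
        [[2.0, 0.0, 0.0], [1.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
    )
    # Weighted robust loss for that image, assuming its true grade is 0.
    loss = weights[0] * generalized_cross_entropy(mean_probs, 0)
    print(weights, label, loss)
```

In a training loop, the class weight for each sample's true grade would multiply that sample's loss, so the two strategies compose naturally; whether the study combined them this way is not stated in the abstract.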
Diagnostic and LOCS III-based grading prediction performance of AI platforms.
The AI platform showed robust diagnostic performance (area under the receiver operating characteristic curve [AUC], 0.9992 [95% confidence interval (CI), 0.9986-0.9998] and 0.9994 [95% CI, 0.9989-0.9998]; accuracy, 98.82% [95% CI, 97.7%-99.9%] and 98.51% [95% CI, 97.4%-99.6%]) and LOCS III-based grading prediction performance (AUC, 0.9567 [95% CI, 0.9501-0.9633] and 0.9650 [95% CI, 0.9509-0.9792]; accuracy, 91.22% [95% CI, 89.4%-93.0%] and 90.26% [95% CI, 88.6%-91.9%]) for nuclear opalescence (NO) and nuclear color (NC) using slit-lamp photographs, respectively. For cortical opacity (CO) and posterior subcapsular opacity (PSC), the system achieved high diagnostic performance (AUC, 0.9680 [95% CI, 0.9579-0.9781] and 0.9465 [95% CI, 0.9348-0.9582]; accuracy, 96.21% [95% CI, 94.4%-98.0%] and 92.17% [95% CI, 88.6%-95.8%]) and good LOCS III-based grading prediction performance (AUC, 0.9044 [95% CI, 0.8958-0.9129] and 0.9174 [95% CI, 0.9055-0.9295]; accuracy, 91.33% [95% CI, 89.7%-93.0%] and 87.89% [95% CI, 85.6%-90.2%]) using retroillumination images.
Our DL-based AI platform yielded accurate and precise detection and grading of NO and NC in 7-level classification and of CO and PSC in 6-level classification, overcoming limitations of medical databases such as limited training data and biased label distributions.