Won Yeo Kyoung, Kim Choong Han, Jeon Jooyoung, Cha Jiho, Lim Dong Hui
Department of Ophthalmology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
Moonsoul Graduate School of Future Strategy, KAIST, Daejeon, Republic of Korea.
Comput Biol Med. 2025 May;190:109976. doi: 10.1016/j.compbiomed.2025.109976. Epub 2025 Mar 18.
To develop three novel Vision Transformer (ViT) frameworks for the specific diagnosis of bacterial and fungal keratitis using different types of anterior segment images, and to compare their performance.
Retrospective study.
A ViT was used to classify bacterial and fungal keratitis. We integrated one or more ViTs, combining different types of anterior segment images (broad-beam, slit-beam, and blue-light) either by vector addition or by self-attention. Model performance was compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Cross-validation was performed three times with no overlap between validation sets; the training/validation split was 8:2 by number of patients.
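The abstract does not specify how the self-attention fusion was implemented. As a rough illustrative sketch only, single-head scaled dot-product self-attention over per-modality ViT embeddings might look like the following; all dimensions, projection weights, and the pooling step are hypothetical, and the random weights stand in for learned parameters:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fusion(embeddings, seed=0):
    """Fuse per-modality ViT embeddings (one vector per image type)
    with single-head scaled dot-product self-attention, then
    mean-pool into one feature vector for a classifier head.

    This is an illustrative sketch, not the paper's architecture:
    the projections would be learned in a real model."""
    X = np.stack(embeddings)              # (n_modalities, d)
    d = X.shape[1]
    rng = np.random.default_rng(seed)     # stand-in for trained weights
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))  # (n, n) cross-modality weights
    fused = attn @ V                      # attended embeddings
    return fused.mean(axis=0)             # pooled feature vector

# Dummy broad-beam and slit-beam embeddings (hypothetical 8-dim vectors).
broad = np.ones(8)
slit = np.arange(8, dtype=float)
feat = self_attention_fusion([broad, slit])
```

The attention matrix lets each modality's embedding weight the others before pooling, which is one plausible reading of how combining broad-beam and slit-beam inputs could outperform a single-image ViT.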
A total of 283 broad-beam, 610 slit-beam, and 342 blue-light images were obtained from 79 patients; 62 (78%) patients were assigned to training and 17 (22%) to validation. The AUROC of the ViT with broad-beam images alone was 0.72. The highest AUROC (0.93) was achieved by combining the outputs of two ViTs with self-attention, using broad-beam and slit-beam images. Similarly, the highest AUPRC (0.93) was achieved by fusing the outputs of three ViTs with self-attention, using broad-beam, slit-beam, and blue-light images.
Despite the limited dataset, we showed that a ViT with self-attention can learn from different types of images to improve recognition accuracy in diagnosing bacterial and fungal keratitis. Combining two or more types of anterior segment images through self-attention meaningfully enhances diagnostic performance for bacterial and fungal keratitis.