Department of Ophthalmology, Pusan National University College of Medicine, Busan, Korea.
Biomedical Research Institute, Pusan National University Hospital, Busan, Korea.
Sci Rep. 2020 Mar 19;10(1):5025. doi: 10.1038/s41598-020-62022-x.
Computer vision has advanced greatly in recent years. Since AlexNet was first introduced, many modified deep learning architectures have been developed, and they are still evolving. However, few studies have compared these architectures in the field of ophthalmology. This study compared the performance of several state-of-the-art deep learning architectures for detecting the optic nerve head and the vertical cup-to-disc ratio in fundus images. Three architectures were compared: YOLO V3, ResNet, and DenseNet. We compared not only detection accuracy but also processing time, diagnostic performance, the effect of the graphics processing unit (GPU), and image resolution. In general, as the input image resolution increased, the classification accuracy, localization error, and diagnostic performance all improved, but the optimal architecture differed depending on the resolution. GPU assistance significantly shortened the processing time: even at the high resolution of 832 × 832, processing took approximately 170 ms with a GPU and was at least 26 times slower without one. The choice of architecture may depend on how the researcher balances speed against accuracy. This study provides a guideline for choosing a deep learning architecture, the optimal image resolution, and the appropriate hardware.
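As an illustration of the quantities discussed above, the following minimal sketch shows one way a vertical cup-to-disc ratio could be derived from two detected bounding boxes, and what the reported timing figures imply for CPU-only processing. The abstract does not state how the ratio was computed, so the bounding-box approach and the names `Box`, `vertical_cdr`, and `cpu_time_lower_bound_ms` are assumptions made for illustration, not the authors' method.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def vertical_cdr(disc: Box, cup: Box) -> float:
    """Vertical cup-to-disc ratio: cup height divided by disc height.

    Assumes both boxes come from a detector run on the same image.
    """
    disc_height = disc.y_max - disc.y_min
    cup_height = cup.y_max - cup.y_min
    if disc_height <= 0:
        raise ValueError("disc box must have positive height")
    return cup_height / disc_height


def cpu_time_lower_bound_ms(gpu_ms: float, slowdown: float) -> float:
    """Lower bound on CPU-only time given a GPU time and a minimum slowdown factor."""
    return gpu_ms * slowdown


if __name__ == "__main__":
    # Made-up example boxes, not data from the study.
    disc = Box(100, 100, 300, 300)
    cup = Box(150, 140, 250, 260)
    print(vertical_cdr(disc, cup))
    # About 170 ms on GPU and at least 26 times slower without one
    # implies at least about 4.4 s per image on CPU alone.
    print(cpu_time_lower_bound_ms(170, 26))
```

The timing helper only restates the abstract's figures as arithmetic; actual CPU times would depend on the hardware and architecture used.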