Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing 100192, China.
Beijing Laboratory of Biomedical Testing Technology and Instruments, Beijing Information Science and Technology University, Beijing 100192, China.
Biosensors (Basel). 2022 Jul 20;12(7):542. doi: 10.3390/bios12070542.
Automatic and accurate classification of optical coherence tomography (OCT) images is of great significance to the computer-assisted diagnosis of retinal disease. In this study, we propose a hybrid ConvNet-Transformer network (HCTNet) and verify the feasibility of a Transformer-based method for retinal OCT image classification. HCTNet first uses a low-level feature extraction module based on the residual dense block to generate low-level features that facilitate network training. Two parallel branches, a Transformer and a ConvNet, are then designed to exploit the global and local context of the OCT images, respectively. Finally, a feature fusion module based on an adaptive re-weighting mechanism combines the extracted global and local features to predict the category of OCT images in the test sets. HCTNet thus combines the strength of the convolutional neural network in extracting local features with the strength of the vision Transformer (ViT) in establishing long-range dependencies. Evaluation on two public retinal OCT datasets shows that HCTNet achieves overall accuracies of 91.56% and 86.18%, respectively, outperforming the pure ViT and several ConvNet-based classification methods.
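The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify how the adaptive re-weighting is computed, so everything here is an assumption made for illustration: the function name `fuse_features`, the per-branch scoring vectors in `gate_weights`, and the use of a softmax over two branch scores are hypothetical and are not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(global_feat, local_feat, gate_weights):
    """Fuse two equal-length feature vectors with input-dependent branch weights.

    global_feat: features from the Transformer branch (global context).
    local_feat:  features from the ConvNet branch (local context).
    gate_weights: hypothetical pair (w_global, w_local) of scoring vectors;
        in a trained network these would be learned parameters.
    """
    if len(global_feat) != len(local_feat):
        raise ValueError("feature vectors must have the same length")
    w_g, w_l = gate_weights
    # One scalar score per branch, computed from that branch's own features.
    score_g = sum(w * x for w, x in zip(w_g, global_feat))
    score_l = sum(w * x for w, x in zip(w_l, local_feat))
    # Normalise the two scores so the branch weights sum to 1.
    alpha_g, alpha_l = softmax([score_g, score_l])
    # Weighted sum of the two branches, element by element.
    fused = [alpha_g * g + alpha_l * l for g, l in zip(global_feat, local_feat)]
    return fused, (alpha_g, alpha_l)
```

In this sketch the branch weights depend on the input features, so the balance between global and local information can change from image to image. With all-zero scoring vectors both branches receive a weight of 0.5 and the fused vector is the plain average of the two inputs.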