Wen Huajie, Zhao Jian, Xiang Shaohua, Lin Lin, Liu Chengjian, Wang Tao, An Lin, Liang Lixin, Huang Bingding
College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China; College of Applied Science, Shenzhen University, Shenzhen 518060, China.
College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China.
Comput Methods Programs Biomed. 2022 Jun;220:106832. doi: 10.1016/j.cmpb.2022.106832. Epub 2022 Apr 27.
A retinal optical coherence tomography (OCT) image differs from a conventional image in its significant speckle noise, irregular structures, and inconspicuous features. A conventional deep learning architecture cannot effectively improve the classification accuracy, sensitivity, and specificity on OCT images, and noisy images hinder further diagnosis. This paper proposes a novel lesion-localization convolution transformer (LLCT) method that combines convolution and self-attention to classify ophthalmic diseases more accurately and to localize lesions in retinal OCT images.
The novel architecture applies customized feature maps generated by a convolutional neural network (CNN) as the input sequence of a self-attention network. This design takes advantage of the CNN's ability to extract image features and the transformer's modeling of global context with dynamic attention. Part of the model is backward-propagated to compute gradients that serve as weight parameters, which are multiplied with the global feature maps produced in the forward pass and summed to localize the lesion.
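A minimal sketch of the two ideas described above, assuming a PyTorch setting (this is not the authors' released code): CNN feature maps are flattened into a token sequence for a transformer encoder, and a Grad-CAM-style weighted sum of backward gradients and forward feature maps yields a lesion heatmap. The layer sizes, class count, and helper names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvTransformerClassifier(nn.Module):
    """CNN feature extractor followed by a transformer encoder (hypothetical sizes)."""

    def __init__(self, num_classes=4, embed_dim=256, num_heads=8, depth=4):
        super().__init__()
        # Small CNN backbone: B x 1 x H x W -> B x embed_dim x H/8 x W/8
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, embed_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)
        self.feature_maps = None  # cached for gradient-based localization

    def forward(self, x):
        fmap = self.backbone(x)             # B x C x h x w feature maps
        fmap.retain_grad()                  # keep gradients for localization
        self.feature_maps = fmap
        tokens = fmap.flatten(2).transpose(1, 2)   # B x (h*w) x C token sequence
        tokens = self.encoder(tokens)              # global self-attention over tokens
        logits = self.head(tokens.mean(dim=1))     # pool tokens, then classify
        return logits


def lesion_heatmap(model, image, class_idx=None):
    """Gradients act as channel weights that are multiplied with the forward
    feature maps and summed into a lesion heatmap (Grad-CAM-style)."""
    logits = model(image)
    if class_idx is None:
        class_idx = logits.argmax(dim=1)
    score = logits.gather(1, class_idx.view(-1, 1)).sum()
    model.zero_grad()
    score.backward()                                  # backward pass for gradients
    grads = model.feature_maps.grad                   # B x C x h x w
    weights = grads.mean(dim=(2, 3), keepdim=True)    # channel-wise weight parameters
    cam = F.relu((weights * model.feature_maps).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                         align_corners=False)         # upsample to image size


# Example: classify a dummy OCT B-scan and produce a lesion heatmap.
model = ConvTransformerClassifier()
scan = torch.randn(1, 1, 224, 224)        # placeholder grayscale OCT image
heatmap = lesion_heatmap(model, scan)     # 1 x 1 x 224 x 224 localization map
```

Because the heatmap is derived from gradients of the class score, no lesion-location annotations are needed, only image-level class labels.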
Extensive experiments show that our proposed design achieves improvements of about 7.6% in overall accuracy, 10.9% in overall sensitivity, and 9.2% in overall specificity over previous methods. Moreover, lesions can be localized without any lesion-location annotations of the OCT images.
The results demonstrate that our method significantly improves performance and reduces the computational complexity of artificial-intelligence-assisted analysis of ophthalmic diseases from OCT images.
Our convolution-transformer method provides a significant boost in ophthalmic disease classification and lesion localization, and can greatly assist ophthalmologists.