Liu Chong, Wang Weiguang, Lian Jian, Jiao Wanzhen
School of Intelligence Engineering, Shandong Management University, Jinan, China.
Department of Ophthalmology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China.
Front Public Health. 2025 Jan 6;12:1442114. doi: 10.3389/fpubh.2024.1442114. eCollection 2024.
Diabetic retinopathy grading plays a vital role in the diagnosis and treatment of patients. In practice, this task relies mainly on manual inspection by human graders. However, manual screening is labor-intensive, time-consuming, and error-prone. Many automated screening techniques have therefore been developed to address this task.
Among these techniques, deep learning models have shown promising results across a wide range of machine vision tasks. However, most deep learning approaches to medical image analysis are built on convolutional operations, which may neglect global dependencies between distant pixels in medical images. Vision transformer models, which can capture associations between pixels across the whole image, have therefore been gradually adopted in medical image analysis. However, the quadratic computational complexity of the attention mechanism has hindered the deployment of vision transformers in clinical practice. With this in mind, this study introduces an integrated self-attention mechanism with both softmax and linear modules to provide efficiency and expressiveness simultaneously. Specifically, a set of proxy tokens is added to the attention module so that it operates on far fewer query and key tokens than the original set. The proxy tokens allow the mechanism to combine the advantages of softmax and linear attention.
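The proxy-token idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `proxy_attention`, the average-pooling used to form the proxies, and the single-head, unbatched layout are all assumptions made for illustration. The sketch shows only how routing attention through m proxy tokens (m much smaller than n) replaces the n x n score matrix with two n x m ones, so the cost grows linearly in n.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def proxy_attention(q, k, v, num_proxies):
    """Hypothetical single-head attention routed through proxy tokens.

    q, k, v: arrays of shape (n, d). num_proxies (m) must satisfy 1 <= m <= n.
    """
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)

    # Assumption: proxies are average-pooled groups of consecutive queries.
    groups = np.array_split(np.arange(n), num_proxies)
    p = np.stack([q[g].mean(axis=0) for g in groups])  # (m, d)

    # Step 1: each proxy aggregates values with softmax attention over keys.
    proxy_values = softmax(p @ k.T * scale) @ v        # (m, d)

    # Step 2: each query reads from the m proxies instead of all n keys.
    return softmax(q @ p.T * scale) @ proxy_values     # (n, d)


rng = np.random.default_rng(0)
q = rng.normal(size=(64, 16))
k = rng.normal(size=(64, 16))
v = rng.normal(size=(64, 16))
out = proxy_attention(q, k, v, num_proxies=8)
print(out.shape)
```

Both score matrices here have shape (n, m) or (m, n), so the work scales as O(n·m·d) rather than the O(n²·d) of full softmax attention. How the proxies are actually constructed in the paper is not stated in this abstract.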
To evaluate the proposed approach, comparison experiments against state-of-the-art algorithms were conducted. The experimental results show that the proposed approach outperforms the state-of-the-art algorithms on publicly available datasets.
Accordingly, the proposed approach could serve as a potentially valuable tool in clinical practice.