Broad Andrew, Wright Alexander, McGenity Clare, Treanor Darren, de Kamps Marc
School of Computing, University of Leeds, Leeds, UK.
Leeds Institute for Data Analytics, University of Leeds, Leeds, UK.
Sci Rep. 2024 Dec 5;14(1):30400. doi: 10.1038/s41598-024-80717-3.
Human visual attention allows prior knowledge or expectations to influence visual processing, allocating limited computational resources to only those parts of the image that are likely to be behaviourally important. Here, we present an image recognition system based on biological vision that guides attention to more informative locations within a larger parent image, using a sequence of saccade-like motions. We demonstrate that, at the end of the saccade sequence, the system has an improved classification ability compared to the convolutional neural network (CNN) that represents the feedforward part of the model. Feedback activations highlight salient image features, supporting the explainability of the classification. Our attention model deviates substantially from more common feedforward attention mechanisms, which linearly reweight part of the input. Instead, it uses several passes of feedforward and backward activation, which interact non-linearly. We apply our feedback architecture to histopathology patch images, demonstrating a 3.5% improvement in accuracy (p < 0.001) when retrospectively processing 59,057 9-class patches from 689 colorectal cancer whole-slide images (WSIs). In the saccade implementation, overall agreement between expert-labelled patches and model prediction reached 93.23% for tumour tissue, surpassing inter-pathologist agreement. Our method is adaptable to other areas of science that rely on the analysis of extremely large-scale images.
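The idea of a saccade sequence over a parent image can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how fixations are chosen, so the sketch simply assumes that each new fixation lands on the most salient location of the current patch. The names `crop`, `argmax2d`, `saccade_sequence`, and `saliency_fn` are hypothetical, and the saliency function here is a stand-in for whatever the feedback activations would provide.

```python
from typing import Callable, List, Tuple

Grid = List[List[float]]
Point = Tuple[int, int]


def crop(image: Grid, centre: Point, size: int) -> Tuple[Grid, Point]:
    """Cut a size x size patch around `centre`, clamped to the image bounds.

    Returns the patch and the (row, col) of its top-left corner.
    """
    rows, cols = len(image), len(image[0])
    top = min(max(centre[0] - size // 2, 0), rows - size)
    left = min(max(centre[1] - size // 2, 0), cols - size)
    patch = [row[left:left + size] for row in image[top:top + size]]
    return patch, (top, left)


def argmax2d(grid: Grid) -> Point:
    """Return the (row, col) of the largest value; the first one wins on ties."""
    best = (0, 0)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value > grid[best[0]][best[1]]:
                best = (r, c)
    return best


def saccade_sequence(
    image: Grid,
    saliency_fn: Callable[[Grid], Grid],
    start: Point,
    size: int,
    n_saccades: int,
) -> List[Point]:
    """Assumed rule: move each fixation to the most salient point of the patch."""
    fixations = [start]
    for _ in range(n_saccades):
        patch, (top, left) = crop(image, fixations[-1], size)
        r, c = argmax2d(saliency_fn(patch))
        # Convert patch coordinates back to parent-image coordinates.
        fixations.append((top + r, left + c))
    return fixations


# Toy parent image whose values rise towards the point (6, 6).
demo_image = [[-(abs(r - 6) + abs(c - 6)) for c in range(8)] for r in range(8)]

# Stand-in saliency (an assumption): the patch values themselves.
path = saccade_sequence(demo_image, lambda patch: patch, (2, 2), 4, 5)
```

In this toy setting the fixations drift from the starting point towards the most informative region and then stay there. A real system would replace the stand-in saliency with model-derived feedback and classify the final patch.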