Masked Transformer for Image Anomaly Localization.

Affiliation

Department of Mathematics, Computer Science and Physics, Università degli Studi di Udine, Via delle Scienze 206, 33100 Udine, Italy.

Publication information

Int J Neural Syst. 2022 Jul;32(7):2250030. doi: 10.1142/S0129065722500307. Epub 2022 Jun 21.

DOI:10.1142/S0129065722500307

PMID:35730477

Abstract

Image anomaly detection consists of detecting images or image portions that are visually different from the majority of the samples in a dataset. The task is of practical importance for various real-life applications such as biomedical image analysis, visual inspection in industrial production, banking, and traffic management. Most current deep learning approaches rely on image reconstruction: the input image is projected into a latent space and then reconstructed, on the assumption that the network (mostly trained on normal data) will not be able to reconstruct the anomalous portions. However, this assumption does not always hold. We thus propose a new model based on the Vision Transformer architecture with patch masking: the input image is split into several patches, and each patch is reconstructed only from the surrounding data, thus ignoring the potentially anomalous information contained in the patch itself. We then show that multi-resolution patches and their collective embeddings provide a large improvement in the model's performance compared to the exclusive use of traditional square patches. The proposed model has been tested on popular anomaly detection datasets such as MVTec and head CT and achieved good results when compared to other state-of-the-art approaches.
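The patch-masking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the Vision Transformer is replaced here by a deliberately simple stand-in (the pixel-wise mean of the neighbouring patches), and the function names `split_into_patches`, `reconstruct_masked`, and `anomaly_scores` are hypothetical. It only shows the principle that each patch is predicted without looking at its own content, so the reconstruction error can serve as a per-patch anomaly score.

```python
from statistics import mean


def split_into_patches(image, patch):
    """Split a 2D list (H x W) into a grid of patch x patch blocks."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    return [
        [[image[r + i][c:c + patch] for i in range(patch)]
         for c in range(0, w, patch)]
        for r in range(0, h, patch)
    ]


def reconstruct_masked(grid, r, c):
    """Predict patch (r, c) from its neighbours only.

    Assumption: the pixel-wise mean of the up-to-8 surrounding patches
    stands in for the Transformer used in the paper.
    """
    rows, cols = len(grid), len(grid[0])
    neighbours = [
        grid[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
        if (rr, cc) != (r, c)  # the masked patch itself is never used
    ]
    if not neighbours:
        raise ValueError("need at least two patches to reconstruct one")
    p = len(grid[r][c])
    return [[mean(n[i][j] for n in neighbours) for j in range(p)]
            for i in range(p)]


def anomaly_scores(image, patch):
    """Return a grid of mean squared reconstruction errors, one per patch."""
    grid = split_into_patches(image, patch)
    scores = []
    for r in range(len(grid)):
        row = []
        for c in range(len(grid[0])):
            pred = reconstruct_masked(grid, r, c)
            target = grid[r][c]
            row.append(mean((target[i][j] - pred[i][j]) ** 2
                            for i in range(patch) for j in range(patch)))
        scores.append(row)
    return scores
```

Given a mostly uniform image with one deviating patch, the highest score falls on that patch, because its neighbours carry no information about the deviation. A learned model would replace `reconstruct_masked`, and the multi-resolution patches mentioned in the abstract are not covered by this sketch.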
