School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
Department of Electrical and Computer Engineering, University of the District of Columbia, 4200 Connecticut Avenue NW, Washington, D.C. 20008, USA.
Neural Netw. 2022 Mar;147:53-62. doi: 10.1016/j.neunet.2021.12.008. Epub 2021 Dec 21.
Anomaly detection is an active research field in industrial defect detection and medical disease detection. However, previous anomaly detection works suffer from unstable training or rely on non-universal criteria for evaluating feature distributions. In this paper, we introduce UTRAD, a U-TRansformer based Anomaly Detection framework. Deep pre-trained features are regarded as dispersed word tokens and represented with transformer-based autoencoders. By reconstructing this more informative feature distribution instead of raw images, we achieve a more stable training process and more precise anomaly detection and localization. In addition, our proposed UTRAD has a multi-scale pyramidal hierarchy with skip connections, which helps detect both multi-scale structural and non-structural anomalies. Because attention layers are decomposed into multi-level patches, UTRAD significantly reduces computational cost and memory usage compared with the vanilla transformer. Experiments have been conducted on the industrial dataset MVTec AD and the medical datasets Retinal-OCT, Brain-MRI, and Head-CT. Our proposed UTRAD outperforms state-of-the-art methods on all of these datasets. Code is released at https://github.com/gordon-chenmo/UTRAD.
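The scoring step the abstract implies can be sketched as follows: once the autoencoder has reconstructed the pre-trained feature map, regions it reconstructs poorly are flagged as anomalous. This is a minimal illustrative sketch only, not the paper's exact formulation; the function names and the choice of per-location L2 error as the anomaly measure are assumptions.

```python
import numpy as np

def anomaly_map(features: np.ndarray, reconstruction: np.ndarray) -> np.ndarray:
    """Per-location reconstruction error over a (C, H, W) feature map.

    `reconstruction` is assumed to come from the transformer autoencoder;
    a large L2 error at a spatial location marks a likely anomaly there.
    """
    return np.linalg.norm(features - reconstruction, axis=0)

def image_score(amap: np.ndarray) -> float:
    """Image-level anomaly score: the worst (largest) local error."""
    return float(amap.max())

# Toy example: identical features everywhere except one "defective" cell.
feats = np.zeros((8, 4, 4))
recon = feats.copy()
recon[:, 2, 3] = 1.0  # reconstruction fails only at location (2, 3)
amap = anomaly_map(feats, recon)
i, j = np.unravel_index(amap.argmax(), amap.shape)
print(int(i), int(j))  # -> 2 3
```

In practice the map would be upsampled to image resolution for pixel-level localization, and the image-level score thresholded to decide whether the sample is anomalous.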