IEEE Trans Neural Netw Learn Syst. 2022 Jun;33(6):2335-2349. doi: 10.1109/TNNLS.2021.3101403. Epub 2022 Jun 1.
This work addresses image anomaly detection using only normal images during training. Most previous methods reconstruct the input image with an autoencoder (AE)-based model, under the assumption that reconstruction errors are small for normal images and large for abnormal ones. However, AE-based methods sometimes reconstruct anomalies well too, which makes them less sensitive to anomalies. To address this issue, we propose to reconstruct the image by leveraging structure-texture correspondence. Specifically, we observe that for normal images the texture can usually be inferred from the corresponding structure (e.g., the blood vessels in a fundus image or the structured anatomy in an optical coherence tomography image), whereas for abnormal images it is hard to infer the texture from a destroyed structure. We therefore propose a structure-texture correspondence memory (STCM) module that reconstructs image texture from structure, using a memory mechanism to characterize the mapping from normal structure to its corresponding normal texture. Because the memory cannot characterize the correspondence between a destroyed structure and its texture, abnormal images yield larger reconstruction errors, which facilitates anomaly detection. We use two complementary kinds of structure (the semantic structure, which carries human-labeled category information, and the low-level structure, which carries abundant detail), each extracted by its own structure extractor. The reconstructions from the two kinds of structure are fused by learned attention weights to produce the final reconstructed image. We then feed the reconstructed image back into the two structure extractors. On the one hand, constraining the structures extracted from the reconstructed image to be consistent with those extracted from the original input regularizes network training; on the other hand, the error between the two sets of structures serves as a supplementary measure for identifying anomalies. Extensive experiments validate the effectiveness of our method for image anomaly detection on both industrial inspection images and medical images.
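The following is a minimal sketch of the three ideas described above: reading a texture from a memory of normal structure-texture pairs, fusing two reconstructions with a weight, and scoring an image by reconstruction error plus structure error. The abstract does not specify the memory addressing scheme, the fusion form, or the score weighting, so the cosine-similarity softmax addressing, the scalar weight `alpha`, the coefficient `lam`, the `sharpness` parameter, and all function names are assumptions made for illustration; the toy vectors are not data from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def memory_read(structure, keys, values, sharpness=10.0):
    """Assumed addressing: softmax over cosine similarity to stored structures.

    Returns a texture vector that is a weighted mix of stored normal textures.
    """
    weights = softmax([sharpness * cosine(structure, k) for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def fuse(recon_a, recon_b, alpha):
    """Blend two reconstructions; alpha stands in for a learned attention weight."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(recon_a, recon_b)]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def anomaly_score(image, recon, struct_in, struct_recon, lam=1.0):
    """Assumed score: reconstruction error plus lam times structure error."""
    return mse(image, recon) + lam * mse(struct_in, struct_recon)


# Toy memory holding two normal structure -> texture pairs (hypothetical values).
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

# A normal sample: its structure matches a stored key, so the texture is recovered.
normal_err = mse([1.0, 1.0, 0.0, 0.0], memory_read([1.0, 0.0], keys, values))

# An abnormal sample: its structure matches no single key, so the read is a blur.
abnormal_err = mse([1.0, 0.0, 1.0, 0.0], memory_read([1.0, 1.0], keys, values))

print(normal_err < abnormal_err)
```

In this toy setting the abnormal structure is equally similar to both memory entries, so the read averages their textures and the reconstruction error rises. That is the behavior the abstract attributes to the memory, although the actual STCM module operates on learned feature maps rather than hand-written vectors.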