Patel Ashay, Tudosiu Petru-Daniel, Pinaya Walter H L, Graham Mark S, Adeleke Olusola, Cook Gary, Goh Vicky, Ourselin Sebastien, Cardoso M Jorge
King's College London.
IEEE Int Conf Comput Vis Workshops. 2023 Dec 25;2023:2394-2402. doi: 10.1109/ICCVW60793.2023.00254.
Anomaly detection and segmentation are important tasks across sectors ranging from medical image analysis to industrial quality control. However, current unsupervised approaches require training data that contain no anomalies, a requirement that can be especially hard to meet in many medical imaging scenarios. In this paper, we propose Iterative Latent Token Masking, a self-supervised framework derived from a robust statistics point of view, which translates iterative model fitting with M-estimators to the task of anomaly detection. This allows unsupervised methods to be trained on datasets heavily contaminated with anomalous images. Our method builds on prior work that combines Transformers with a Vector Quantized-Variational Autoencoder for anomaly detection, an approach with state-of-the-art performance when trained on normal (non-anomalous) data. More importantly, we utilise the token masking capabilities of Transformers to filter out suspected anomalous tokens from each sample's sequence in the training set in an iterative self-supervised process, thus overcoming the difficulties of highly anomalous training data. Our work also highlights shortfalls in current state-of-the-art self-supervised, self-trained and unsupervised models when faced with small proportions of anomalous training data. We evaluate our method on whole-body PET data and show its wider applicability to more common computer vision tasks using the industrial MVTec dataset. Across varying levels of anomalous training data, our method outperforms several state-of-the-art models, drawing attention to the potential of this approach.
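The iterative masking idea described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: a smoothed per-position categorical model stands in for the VQ-VAE and Transformer, and the function names, the likelihood threshold, and the synthetic data are all assumptions made for illustration. The hard keep/mask decision is loosely analogous to iteratively reweighted fitting with 0/1 weights.

```python
import numpy as np


def make_toy_data(n=200, t=16, vocab_size=8, frac_contaminated=0.6, span=8, seed=0):
    """Hypothetical token sequences; a fraction contain a corrupted span."""
    rng = np.random.default_rng(seed)
    normal = np.arange(t) % vocab_size              # the "healthy" token at each position
    tokens = np.tile(normal, (n, 1))
    is_anomalous = np.zeros((n, t), dtype=bool)
    bad = rng.choice(n, size=round(frac_contaminated * n), replace=False)
    for i in bad:
        start = rng.integers(0, t - span + 1)
        is_anomalous[i, start:start + span] = True
    # A shift in [1, vocab_size) guarantees a corrupted token differs from the normal one.
    shift = rng.integers(1, vocab_size, size=(n, t))
    tokens = np.where(is_anomalous, (tokens + shift) % vocab_size, tokens)
    return tokens, is_anomalous


def fit_position_model(tokens, keep, vocab_size, alpha=1.0):
    """Per-position categorical model fitted only on unmasked (kept) tokens."""
    t = tokens.shape[1]
    counts = np.full((t, vocab_size), alpha)        # Laplace smoothing (assumed)
    for k in range(vocab_size):
        counts[:, k] += ((tokens == k) & keep).sum(axis=0)
    return counts / counts.sum(axis=1, keepdims=True)


def token_likelihoods(tokens, probs):
    """Likelihood of each observed token under the current model."""
    return probs[np.arange(tokens.shape[1])[None, :], tokens]


def iterative_token_masking(tokens, vocab_size, threshold=0.2, n_iters=5):
    """Alternate between fitting on kept tokens and re-estimating the mask."""
    keep = np.ones(tokens.shape, dtype=bool)
    for _ in range(n_iters):
        probs = fit_position_model(tokens, keep, vocab_size)
        keep = token_likelihoods(tokens, probs) >= threshold
    return fit_position_model(tokens, keep, vocab_size), keep


tokens, is_anomalous = make_toy_data()
all_kept = np.ones(tokens.shape, dtype=bool)
naive_probs = fit_position_model(tokens, all_kept, vocab_size=8)   # no masking
robust_probs, keep = iterative_token_masking(tokens, vocab_size=8)

normal = np.arange(tokens.shape[1]) % 8
positions = np.arange(tokens.shape[1])
print("masked tokens that are truly anomalous:", (~keep & is_anomalous).sum())
print("min p(normal token), naive fit:  ", naive_probs[positions, normal].min())
print("min p(normal token), masked fit: ", robust_probs[positions, normal].min())
```

In this toy setting the masked fit concentrates probability on the normal token at every position, whereas the naive fit is diluted by the contaminated samples. A real system would replace the categorical model with a learned autoregressive model over VQ-VAE latent tokens.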