Chair for Computer Aided Medical Procedures (CAMP), Technical University of Munich, Boltzmannstr. 3, Garching, Germany.
Med Image Anal. 2021 Apr;69:101952. doi: 10.1016/j.media.2020.101952. Epub 2021 Jan 2.
Deep unsupervised representation learning has recently led to new approaches in the field of Unsupervised Anomaly Detection (UAD) in brain MRI. The main principle behind these works is to learn a model of normal anatomy by learning to compress and recover healthy data. This makes it possible to spot abnormal structures from erroneous recoveries of compressed, potentially anomalous samples. The concept is of great interest to the medical image analysis community because it i) removes the need for vast amounts of manually segmented training data, a necessity for and pitfall of current supervised Deep Learning, and ii) theoretically allows the detection of arbitrary, even rare pathologies which supervised approaches might fail to find. To date, the experimental design of most works hinders a valid comparison, because they i) are evaluated against different datasets and different pathologies, ii) use different image resolutions and iii) use different model architectures with varying complexity. The intent of this work is to establish comparability among recent methods by utilizing a single architecture, a single resolution and the same dataset(s). Besides providing a ranking of the methods, we also try to answer questions such as i) how many healthy training subjects are needed to model normality and ii) whether the reviewed approaches are also sensitive to domain shift. Further, we identify open challenges and provide suggestions for future community efforts and research directions.
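The compress-and-recover principle described above can be illustrated with a minimal sketch. It is not one of the reviewed methods: a linear PCA projection stands in for the deep autoencoders, the "scans" are synthetic 1-D signals rather than brain MRI, and all function names, sizes and the threshold value are assumptions made for illustration only.

```python
import numpy as np


def fit_linear_autoencoder(healthy, n_components):
    """Fit a PCA model on healthy samples (one flattened sample per row)."""
    mean = healthy.mean(axis=0)
    # Rows of vt are the principal directions of the centered healthy data.
    _, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruct(x, mean, components):
    """Compress x to a low-dimensional code, then recover it."""
    code = (x - mean) @ components.T   # compress
    return mean + code @ components    # recover


def residual_map(x, mean, components):
    """Absolute per-pixel difference between input and reconstruction."""
    return np.abs(x - reconstruct(x, mean, components))


rng = np.random.default_rng(0)
n_pixels = 64
positions = np.arange(n_pixels) + 0.5

# Assumption: "healthy anatomy" is a mix of two smooth cosine patterns.
basis = np.stack([
    np.sqrt(2.0 / n_pixels) * np.cos(np.pi * k * positions / n_pixels)
    for k in (1, 2)
])

# Synthetic healthy training set with a little acquisition noise.
healthy = 5.0 * rng.normal(size=(200, 2)) @ basis
healthy += 0.01 * rng.normal(size=healthy.shape)

mean, components = fit_linear_autoencoder(healthy, n_components=2)

# An unseen healthy sample, and a copy with a synthetic "lesion".
healthy_scan = 5.0 * rng.normal(size=2) @ basis
anomalous_scan = healthy_scan.copy()
anomalous_scan[30:34] += 1.0

# The model cannot recover the lesion, so it shows up in the residual.
threshold = 0.5  # illustrative value, not taken from the paper
residual = residual_map(anomalous_scan, mean, components)
detected = np.flatnonzero(residual > threshold)
print("pixels flagged as anomalous:", detected.tolist())
```

Because the model only ever saw healthy patterns, it reconstructs the healthy sample almost perfectly but cannot reproduce the added lesion, which therefore dominates the residual. In practice the threshold would have to be chosen on held-out data, and the behaviour of deep models can differ from this linear stand-in.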