IEEE J Biomed Health Inform. 2022 Sep;26(9):4599-4610. doi: 10.1109/JBHI.2022.3186882. Epub 2022 Sep 9.
Learning similarity is a key aspect of medical image analysis, particularly in recommendation systems or in uncovering the interpretation of anatomical data in images. Most existing methods learn such similarities in the embedding space over image sets using a single metric learner. Images, however, have a variety of object attributes such as color, shape, or artifacts. Encoding such attributes using a single metric learner is inadequate and may fail to generalize. Instead, multiple learners could focus on separate aspects of these attributes in subspaces of an overarching embedding. This, however, requires the number of learners to be determined empirically for each new dataset. This work, Dynamic Subspace Learners, proposes to dynamically exploit multiple learners by removing the need to know the number of learners a priori and by aggregating new subspace learners during training. Furthermore, the visual interpretability of such subspace learning is enforced by integrating an attention module into our method. This integrated attention mechanism provides visual insight into the discriminative image features that contribute to the clustering of image sets, as well as a visual explanation of the embedding features. The benefits of our attention-based dynamic subspace learners are evaluated on image clustering, image retrieval, and weakly supervised segmentation. Our method achieves results competitive with multiple-learner baselines and significantly outperforms the classification network in terms of clustering and retrieval scores on three different public benchmark datasets. Moreover, our method provides an attention map generated directly during inference to illustrate the visual interpretability of the embedding features. These attention maps offer proxy labels, which improve segmentation accuracy by up to 15% in Dice score when compared to state-of-the-art interpretation techniques.
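For readers unfamiliar with the segmentation metric cited above, the Dice score between a predicted mask A and a reference mask B is the standard overlap measure 2|A∩B| / (|A| + |B|). The following is a minimal sketch of that definition only; it is not the authors' evaluation code. The function name `dice_score`, the example masks, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x3 masks, flattened row by row (illustrative values only).
proxy_label = [1, 1, 0, 0, 1, 0]
ground_truth = [1, 0, 0, 0, 1, 1]
score = dice_score(proxy_label, ground_truth)
```

Here the two masks share two foreground pixels and contain three foreground pixels each, so the score is 2·2 / (3 + 3) = 2/3.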