Hosseinzadeh Taher Mohammad Reza, Gotway Michael B, Liang Jianming
Arizona State University.
Mayo Clinic.
Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. 2024 Jun:11269-11281. doi: 10.1109/CVPR52733.2024.01071. Epub 2024 Sep 16.
Humans effortlessly interpret images by parsing them into part-whole hierarchies. Deep learning models excel at learning multi-level feature spaces, but they often lack explicit encoding of part-whole relations, a prominent property of medical imaging. To overcome this limitation, we introduce Adam-v2, a new self-supervised learning (SSL) framework that extends Adam [79] by explicitly incorporating part-whole hierarchies into its learning objectives through three key branches: (1) Localizability, acquiring discriminative representations to distinguish different anatomical patterns; (2) Composability, learning each anatomical structure in a parts-to-whole manner; and (3) Decomposability, comprehending each anatomical structure in a whole-to-parts manner. Experimental results on 10 downstream tasks, compared with 11 baselines in zero-shot, few-shot transfer, and full fine-tuning settings, show that Adam-v2 outperforms large-scale medical models and existing SSL methods. The greater generality and robustness of Adam-v2's representations stem from its explicit construction of hierarchies for distinct anatomical structures from unlabeled medical images. Adam-v2 preserves a semantic balance of anatomical diversity and harmony in its embedding, yielding representations that are both generic and semantically meaningful, a property overlooked by existing SSL methods. All code and pretrained models are available at GitHub.com/JLiangLab/Eden.
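To make the three branches more concrete, the sketch below shows one way such objectives could be combined into a single training loss. Everything in it is an illustrative assumption: the ToyEncoder backbone, the InfoNCE-style localizability term, the cosine-based composability and decomposability terms, and the decomposer head are stand-ins, not the authors' implementation, which is available at GitHub.com/JLiangLab/Eden.

```python
# Minimal, hypothetical sketch of the three objectives named in the abstract:
# localizability, composability, decomposability. Not the Adam-v2 implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyEncoder(nn.Module):
    """Hypothetical backbone mapping an image (or image part) to a unit-norm embedding."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)


def localizability_loss(z1, z2, temperature=0.1):
    """InfoNCE-style stand-in: two views of the same anatomical pattern attract,
    other patterns in the batch repel (discriminative branch)."""
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)


def composability_loss(part_embeddings, whole_embedding):
    """Parts-to-whole: the aggregate of part embeddings should match the whole."""
    composed = F.normalize(part_embeddings.mean(dim=1), dim=-1)
    return 1.0 - F.cosine_similarity(composed, whole_embedding, dim=-1).mean()


def decomposability_loss(predicted_parts, part_embeddings):
    """Whole-to-parts: parts predicted from the whole's embedding should match
    the embeddings obtained by encoding the parts directly."""
    return 1.0 - F.cosine_similarity(predicted_parts, part_embeddings, dim=-1).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    encoder = ToyEncoder()
    decomposer = nn.Linear(128, 4 * 128)  # hypothetical head: whole -> 4 part embeddings

    batch, parts = 8, 4
    whole = torch.randn(batch, 1, 64, 64)                 # "whole" anatomical crops
    whole_aug = whole + 0.05 * torch.randn_like(whole)    # second augmented view
    part_crops = torch.randn(batch, parts, 1, 32, 32)     # sub-crops ("parts")

    z_whole = encoder(whole)
    z_whole2 = encoder(whole_aug)
    z_parts = encoder(part_crops.flatten(0, 1)).view(batch, parts, -1)
    z_pred = F.normalize(decomposer(z_whole).view(batch, parts, -1), dim=-1)

    loss = (localizability_loss(z_whole, z_whole2)
            + composability_loss(z_parts, z_whole)
            + decomposability_loss(z_pred, z_parts))
    print(f"combined toy objective: {loss.item():.4f}")
```

In this toy formulation the three terms are simply summed; how the branches are weighted, which encoder is used, and how parts and wholes are sampled from medical images are all details deferred to the released code.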