Singh Azad, Gorade Vandan, Mishra Deepak
IEEE J Biomed Health Inform. 2024 Dec;28(12):7480-7490. doi: 10.1109/JBHI.2024.3455337. Epub 2024 Dec 5.
Self-supervised learning (SSL) reduces the need for manual annotation in deep learning models for medical image analysis. By learning representations from unlabelled data, self-supervised models perform well on tasks that require little to no fine-tuning. However, medical images such as chest X-rays, characterised by complex anatomical structures and diverse clinical conditions, call for representation learning techniques that encode fine-grained details while preserving the broader contextual information. In this context, we introduce MLVICX (Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning), an approach for capturing rich representations in the form of embeddings from chest X-ray images. Central to our approach is a novel multi-level variance and covariance exploration strategy that enables the model to detect diagnostically meaningful patterns while reducing redundancy. MLVICX promotes the retention of critical medical insights by adapting global and local contextual details and enhancing the variance and covariance of the learned embeddings. We demonstrate the performance of MLVICX in advancing self-supervised chest X-ray representation learning through comprehensive experiments. The performance gains we observe across various downstream tasks highlight the significance of the proposed approach in enhancing the utility of chest X-ray embeddings for precision medical diagnosis and comprehensive image analysis. For pretraining, we used the NIH-Chest X-ray dataset; downstream tasks used the NIH-Chest X-ray, VinBig-CXR, RSNA Pneumonia, and SIIM-ACR Pneumothorax datasets. Overall, we observe up to a 3% performance gain over state-of-the-art SSL approaches across various downstream tasks. Additionally, to demonstrate the generalizability of our method, we conducted further experiments on fundus images and observed superior performance on multiple datasets. Code is available on GitHub.
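The abstract does not give the loss formulation, but the variance-covariance exploration it describes is in the spirit of VICReg-style regularizers: a variance term that keeps each embedding dimension informative, and a covariance term that decorrelates dimensions to reduce redundancy. The sketch below is a minimal NumPy illustration of such a regularizer, not the paper's actual multi-level objective; the function name, the hinge threshold `gamma`, and the normalization choices are assumptions for illustration only.

```python
import numpy as np

def variance_covariance_loss(z, gamma=1.0, eps=1e-4):
    """Illustrative VICReg-style regularizer (assumed form, not MLVICX's exact loss).

    z: (N, D) array of N embeddings with D dimensions.
    Returns (variance_loss, covariance_loss).
    """
    n, d = z.shape
    # Variance term: hinge loss pushing each dimension's std above gamma,
    # which discourages dimensional collapse of the embeddings.
    std = np.sqrt(z.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, gamma - std))
    # Covariance term: penalize off-diagonal covariance entries so that
    # different embedding dimensions carry non-redundant information.
    zc = z - z.mean(axis=0)
    cov = (zc.T @ zc) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d
    return var_loss, cov_loss
```

In a multi-level setup one would presumably apply such terms to embeddings taken at several stages of the encoder, so that both local (fine-grained) and global (contextual) features are kept high-variance and decorrelated.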