Li Junjian, Kuang Hulin, Liu Jin, Yue Hailin, Wang Jianxin
IEEE J Biomed Health Inform. 2025 Jul;29(7):5095-5108. doi: 10.1109/JBHI.2025.3552640.
Pathological diagnosis helps save lives, but diagnostic models are annotation-hungry, and pathological images are notably expensive to annotate. Contrastive learning could be a promising solution because it relies only on unlabeled training data to generate informative representations. However, most current contrastive learning methods have two issues: (1) positive samples produced through random augmentation are not challenging enough, and (2) negative sampling bias introduces false negative pairs. To alleviate these issues, we propose a novel contrastive learning method called Cluster-Aware Adversarial Contrastive Learning (CACL). Specifically, a mixed data augmentation technique generates more discriminative sample pairs so that the model learns more transferable representations. Furthermore, to mitigate the effect of inherent false negative pairs, we adopt a cluster-aware loss that identifies similarities between instances and incorporates them into contrastive learning. Finally, we generate challenging contrastive data pairs through adversarial learning and learn robust representations in the representation space without labeled training data, with the aim of maximizing the similarity between each augmented sample and its related adversarial sample. CACL is evaluated on two public datasets, NCT-CRC-HE and PCam, for the fine-tuning and linear evaluation tasks, and on two other public datasets, GlaS and CARG, for the detection and segmentation tasks, respectively. Extensive experimental results show that our method outperforms several self-supervised learning (SSL) methods and ImageNet pretraining on all four tasks, particularly in scenarios with limited data availability.
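To make the false negative issue concrete, the following is a minimal sketch of one generic way to use cluster information in a contrastive loss: negatives that share a cluster with the anchor are dropped from the InfoNCE denominator. It is not the CACL cluster-aware loss, whose exact form the abstract does not give. The function names, the temperature value, and the assumption that cluster IDs come from some external clustering step (for example k-means on the embeddings) are all illustrative.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def cluster_masked_info_nce(z1, z2, cluster_ids, temperature=0.5):
    """InfoNCE over two views, ignoring negatives in the anchor's cluster.

    z1, z2      : (N, D) embeddings of two augmented views of the same N images
    cluster_ids : (N,) integer cluster assignment per image (assumed given)
    """
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    logits = z1 @ z2.T / temperature              # (N, N) scaled similarities
    n = logits.shape[0]
    same_cluster = cluster_ids[:, None] == cluster_ids[None, :]
    positives = np.eye(n, dtype=bool)             # view i of z1 matches view i of z2
    # Suspected false negatives: same cluster as the anchor, but not its positive.
    drop = same_cluster & ~positives
    masked = np.where(drop, -np.inf, logits)
    # Numerically stable log-sum-exp over the remaining candidates.
    row_max = masked.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(masked - row_max).sum(axis=1))
    return float(np.mean(log_denominator - np.diag(logits)))

# Toy data: six samples, two noisy views, three hypothetical clusters.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(6, 8))
z2 = z1 + 0.1 * rng.normal(size=(6, 8))
clusters = np.array([0, 0, 1, 1, 2, 2])

plain_loss = cluster_masked_info_nce(z1, z2, np.arange(6))   # no shared clusters
masked_loss = cluster_masked_info_nce(z1, z2, clusters)      # same-cluster negatives dropped
```

With unique cluster IDs the function reduces to standard InfoNCE. Dropping same-cluster negatives can only shrink the denominator, so the masked loss is never larger than the plain one on the same batch.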