Cui Jiequan, Zhong Zhisheng, Tian Zhuotao, Liu Shu, Yu Bei, Jia Jiaya
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):7463-7474. doi: 10.1109/TPAMI.2023.3278694. Epub 2024 Nov 6.
In this paper, we propose Generalized Parametric Contrastive Learning (GPaCo/PaCo), which works well on both imbalanced and balanced data. Through theoretical analysis, we observe that the supervised contrastive loss tends to be biased toward high-frequency classes, which increases the difficulty of imbalanced learning. We introduce a set of parametric, class-wise learnable centers to rebalance from an optimization perspective. Further, we analyze the GPaCo/PaCo loss under a balanced setting. Our analysis shows that GPaCo/PaCo adaptively intensifies the pushing of same-class samples closer together as more samples are pulled toward their corresponding centers, which benefits hard-example learning. Experiments on long-tailed benchmarks establish a new state of the art for long-tailed recognition. On full ImageNet, models ranging from CNNs to vision transformers trained with the GPaCo loss show better generalization and stronger robustness than MAE models. Moreover, GPaCo can be applied to semantic segmentation, yielding clear improvements on four popular benchmarks.
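To make the abstract's central idea concrete, below is a minimal PyTorch sketch of a PaCo-style loss inferred from the description above: learnable per-class centers join the contrastive denominator, and a weight `alpha` down-weights sample-to-sample positives relative to the sample-to-center positive, which is the rebalancing mechanism the abstract refers to. The class name `PaCoStyleLoss` and the hyperparameter values are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PaCoStyleLoss(nn.Module):
    """Simplified sketch of a parametric contrastive loss with
    learnable class centers (hypothetical, inferred from the abstract)."""

    def __init__(self, num_classes: int, feat_dim: int,
                 alpha: float = 0.05, temperature: float = 0.07):
        super().__init__()
        # One learnable center per class; initialization is an assumption.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.alpha = alpha
        self.temperature = temperature

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (B, D) embeddings; labels: (B,) integer class ids.
        features = F.normalize(features, dim=1)
        centers = F.normalize(self.centers, dim=1)

        # Similarities to in-batch samples and to all class centers.
        logits = torch.cat(
            [features @ features.T, features @ centers.T], dim=1
        ) / self.temperature

        batch_size = features.size(0)
        # Same-class in-batch samples are positives, down-weighted by alpha.
        same_class = labels.unsqueeze(1).eq(labels.unsqueeze(0)).float()
        same_class.fill_diagonal_(0.0)  # exclude self-contrast
        # The sample's own class center is a positive with weight 1.
        center_pos = F.one_hot(labels, num_classes=centers.size(0)).float()
        pos_weights = torch.cat([self.alpha * same_class, center_pos], dim=1)

        # Remove self-similarity from the denominator as well.
        neg_inf = torch.finfo(logits.dtype).min
        self_mask = torch.eye(batch_size, device=logits.device, dtype=torch.bool)
        logits[:, :batch_size] = logits[:, :batch_size].masked_fill(self_mask, neg_inf)

        log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
        # Weighted mean of positive log-probabilities per anchor.
        loss = -(pos_weights * log_prob).sum(1) / pos_weights.sum(1).clamp(min=1e-8)
        return loss.mean()
```

Usage would follow the standard supervised-contrastive recipe: feed L2-normalized encoder embeddings and labels per batch, e.g. `loss = PaCoStyleLoss(1000, 128)(feats, labels)`; as the abstract notes, the centers rebalance the gradient contribution of high-frequency classes during optimization.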