Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer.

Authors

Chen Junya, Xiu Zidi, Goldstein Benjamin A, Henao Ricardo, Carin Lawrence, Tao Chenyang

Affiliation

Duke University.

Publication

Adv Neural Inf Process Syst. 2021 Dec;34:21229-21243.

Abstract

Dealing with severe class imbalance poses a major challenge for many real-world applications, especially when the accurate classification and generalization of minority classes are of primary interest. In computer vision and NLP, learning from datasets with long-tail behavior is a recurring theme, especially for naturally occurring labels. Existing solutions mostly appeal to sampling or weighting adjustments to alleviate the extreme imbalance, or impose inductive bias to prioritize generalizable associations. Here we take a novel perspective to promote sample efficiency and model generalization based on the invariance principles of causality. Our contribution posits a meta-distributional scenario, where the causal generating mechanism for label-conditional features is invariant across different labels. This causal assumption enables efficient knowledge transfer from the dominant classes to their under-represented counterparts, even if their feature distributions show apparent disparities. This allows us to leverage a causal data augmentation procedure to enlarge the representation of minority classes. Our development is orthogonal to existing imbalanced data learning techniques and can thus be seamlessly integrated with them. The proposed approach is validated on an extensive set of synthetic and real-world tasks against state-of-the-art solutions.
