Liu Ziwei, Miao Zhongqi, Zhan Xiaohang, Wang Jiayun, Gong Boqing, Yu Stella X
IEEE Trans Pattern Anal Mach Intell. 2024 Mar;46(3):1836-1851. doi: 10.1109/TPAMI.2022.3200091. Epub 2024 Feb 6.
Real-world data often exhibit a long-tailed and open-ended (i.e., with unseen classes) distribution. A practical recognition system must balance between majority (head) and minority (tail) classes, generalize across the distribution, and acknowledge novelty when it encounters instances of unseen (open) classes. We define Open Long-Tailed Recognition++ (OLTR++) as learning from such naturally distributed data and optimizing for classification accuracy over a balanced test set that includes both known and open classes. OLTR++ handles imbalanced classification, few-shot learning, open-set recognition, and active learning in one integrated algorithm, whereas existing classification approaches often focus on only one or two of these aspects and perform poorly over the entire spectrum. The key challenges are: 1) how to share visual knowledge between head and tail classes, 2) how to reduce confusion between tail and open classes, and 3) how to actively explore open classes with the learned knowledge. Our algorithm, OLTR++, maps images to a feature space in which visual concepts relate to one another through a memory association mechanism and a learned metric (dynamic meta-embedding) that both respects the closed-world classification of seen classes and acknowledges the novelty of open classes. Additionally, we propose an active learning scheme based on visual memory, which learns to recognize open classes in a data-efficient manner for future expansion. On three large-scale open long-tailed datasets we curated from ImageNet (object-centric), Places (scene-centric), and MS1M (face-centric) data, as well as three standard benchmarks (CIFAR-10-LT, CIFAR-100-LT, and iNaturalist-18), our approach, as a unified framework, consistently demonstrates competitive performance. Notably, it also shows strong potential for the active exploration of open classes and the fairness analysis of minority groups.
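The dynamic meta-embedding described above can be sketched as a backbone feature enriched by attention over a visual memory of class centroids, then scaled by a reachability score that shrinks for samples far from every centroid. The PyTorch module below is a minimal illustration of that idea; the class name `MetaEmbedding`, the parameter `tau`, the tanh concept selector, and the exact combination rule are illustrative assumptions, not the authors' released implementation.

```python
# A minimal sketch of a dynamic meta-embedding, assuming learnable per-class
# centroids as the visual memory. Names and the exact combination rule are
# illustrative, not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaEmbedding(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        # Visual memory: one learnable centroid per seen (closed-world) class.
        self.centroids = nn.Parameter(torch.randn(num_classes, feat_dim))
        # Concept selector: gates, per dimension, how much memory to mix in.
        self.selector = nn.Linear(feat_dim, feat_dim)

    def forward(self, v_direct: torch.Tensor, tau: float = 16.0):
        # v_direct: (B, D) features from a backbone network.
        # Attend over class centroids to build a memory feature.
        attn = F.softmax(v_direct @ self.centroids.t(), dim=1)       # (B, C)
        v_memory = attn @ self.centroids                             # (B, D)
        # Gate decides how much memory to inject; tail-class samples can
        # borrow more knowledge from memory than data-rich head classes.
        gate = torch.tanh(self.selector(v_direct))
        v_meta = v_direct + gate * v_memory
        # Reachability: inverse distance to the nearest centroid. Samples far
        # from all seen classes get a small scale, shrinking their logits and
        # flagging likely open-class instances.
        dist = torch.cdist(v_direct, self.centroids)                 # (B, C)
        reachability = tau / dist.min(dim=1).values.clamp_min(1e-6)  # (B,)
        return reachability.unsqueeze(1) * v_meta, reachability
```

Because the reachability score drops for samples that lie far from all seen-class centroids, the same quantity could also serve as an open-class indicator or as an acquisition signal for the memory-based active learning scheme mentioned in the abstract.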