Learning with Privileged Information via Adversarial Discriminative Modality Distillation.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2581-2593. doi: 10.1109/TPAMI.2019.2929038. Epub 2019 Jul 16.

Abstract

Heterogeneous data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance. However, while training data can be accurately collected to include a variety of sensory modalities, often not all of them are available in the real-life (testing) scenarios where a model has to be deployed. This raises the challenge of how to extract information from multimodal data in the training stage, in a form that can be exploited at test time, given limitations such as noisy or missing modalities. This paper presents a new approach in this direction for RGB-D vision tasks, developed within the adversarial learning and privileged information frameworks. We consider the practical case of learning representations from depth and RGB videos while relying only on RGB data at test time. We propose a new approach to train a hallucination network that learns to distill depth information via adversarial learning, resulting in a clean approach that does not require balancing several losses or hyperparameters. We report state-of-the-art results for object classification on the NYUD dataset, and for video action recognition on NTU RGB+D, the largest multimodal dataset available for this task, as well as on the Northwestern-UCLA dataset.
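
As a rough illustration of the adversarial distillation idea described in the abstract, the sketch below computes generic GAN-style losses for a discriminator that scores feature vectors and for a hallucination network that tries to make its RGB-derived features pass as depth features. This is only an assumption-laden sketch: the abstract does not state the loss formulation, so the binary cross-entropy objective and the function names (`discriminator_loss`, `hallucination_loss`) are hypothetical and should not be read as the paper's actual method.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one probability/target pair."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(depth_logits, halluc_logits):
    """Assumed objective: label depth-stream features 1, hallucinated features 0.

    Each argument is a list of raw discriminator scores (logits),
    one per feature vector in the batch.
    """
    real = [bce(sigmoid(z), 1.0) for z in depth_logits]
    fake = [bce(sigmoid(z), 0.0) for z in halluc_logits]
    return (sum(real) + sum(fake)) / (len(real) + len(fake))


def hallucination_loss(halluc_logits):
    """Assumed objective: the hallucination network wants label 1 for its features."""
    losses = [bce(sigmoid(z), 1.0) for z in halluc_logits]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # An undecided discriminator (logit 0) gives log(2) for both losses.
    print(discriminator_loss([0.0], [0.0]))
    print(hallucination_loss([0.0]))
```

In a training loop of this kind, the two losses would typically be minimized in alternation, with only the hallucination (RGB-input) stream kept at test time, consistent with the RGB-only test setting the abstract describes.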
