Transductive multi-view zero-shot learning.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2015 Nov;37(11):2332-45. doi: 10.1109/TPAMI.2015.2408354.

Abstract

Most existing zero-shot learning approaches exploit transfer learning via an intermediate semantic representation shared between an annotated auxiliary dataset and a target dataset with different classes and no annotation. A projection from a low-level feature space to the semantic representation space is learned from the auxiliary dataset and applied without adaptation to the target dataset. In this paper we identify two inherent limitations with these approaches. First, due to having disjoint and potentially unrelated classes, the projection functions learned from the auxiliary dataset/domain are biased when applied directly to the target dataset/domain. We call this problem the projection domain shift problem and propose a novel framework, transductive multi-view embedding, to solve it. The second limitation is the prototype sparsity problem which refers to the fact that for each target class, only a single prototype is available for zero-shot learning given a semantic representation. To overcome this problem, a novel heterogeneous multi-view hypergraph label propagation method is formulated for zero-shot learning in the transductive embedding space. It effectively exploits the complementary information offered by different semantic representations and takes advantage of the manifold structures of multiple representation spaces in a coherent manner. We demonstrate through extensive experiments that the proposed approach (1) rectifies the projection shift between the auxiliary and target domains, (2) exploits the complementarity of multiple semantic representations, (3) significantly outperforms existing methods for both zero-shot and N-shot recognition on three image and video benchmark datasets, and (4) enables novel cross-view annotation tasks.
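
The conventional pipeline that the abstract critiques (learn a projection from features to the semantic space on the auxiliary classes, then apply it unchanged to the target classes and match against one prototype per class) can be sketched roughly as follows. This is a minimal illustration, not the paper's transductive multi-view method: the ridge-regression projection, the nearest-prototype rule, the function names, and the synthetic data are all assumptions made for the example. Because the toy data uses one shared linear map for both domains, it does not exhibit the projection domain shift the paper addresses.

```python
import numpy as np

def fit_ridge_projection(X_aux, S_aux, lam=1e-2):
    """Closed-form ridge regression: find W so that X_aux @ W approximates S_aux."""
    d = X_aux.shape[1]
    return np.linalg.solve(X_aux.T @ X_aux + lam * np.eye(d), X_aux.T @ S_aux)

def predict_zero_shot(X_tgt, W, prototypes):
    """Project target features, then pick the nearest class prototype."""
    projected = X_tgt @ W
    # Squared Euclidean distances, shape (n_samples, n_target_classes).
    dists = ((projected[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# Synthetic setup (all sizes are arbitrary choices for illustration).
rng = np.random.default_rng(0)
sem_dim, feat_dim, per_class = 5, 20, 50
A = rng.normal(size=(sem_dim, feat_dim))   # hypothetical semantic-to-feature map

def sample(prototypes):
    """Draw noisy low-level features for each class prototype."""
    labels = np.repeat(np.arange(len(prototypes)), per_class)
    S = prototypes[labels]
    X = S @ A + 0.01 * rng.normal(size=(len(labels), feat_dim))
    return X, S, labels

aux_protos = rng.normal(size=(10, sem_dim))  # annotated auxiliary classes
tgt_protos = rng.normal(size=(3, sem_dim))   # disjoint, unannotated target classes

X_aux, S_aux, _ = sample(aux_protos)
X_tgt, _, y_tgt = sample(tgt_protos)

W = fit_ridge_projection(X_aux, S_aux)       # learned on auxiliary data only
y_pred = predict_zero_shot(X_tgt, W, tgt_protos)
accuracy = (y_pred == y_tgt).mean()
print(f"zero-shot accuracy on synthetic data: {accuracy:.2f}")
```

Each target class is represented here by a single prototype vector, which is the situation the abstract calls prototype sparsity. On real data, where the auxiliary and target classes differ, a projection fitted this way can be biased on the target domain.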
