JSE: Joint Semantic Encoder for zero-shot gesture learning.

Authors

Madapana Naveen, Wachs Juan

Affiliation

School of Industrial Engineering, Purdue University, West Lafayette IN 47906, United States.

Publication information

Pattern Anal Appl. 2022 Aug;25(3):679-692. doi: 10.1007/s10044-021-00992-y. Epub 2021 Jun 11.

DOI:10.1007/s10044-021-00992-y

PMID:39588314

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11588148/

Abstract

Zero-shot learning (ZSL) is a transfer learning paradigm that aims to recognize unseen categories from high-level descriptions of them alone. While deep learning has greatly pushed the limits of ZSL for object classification, ZSL for gesture recognition (ZSGL) remains largely unexplored. Previous attempts to address ZSGL focused on the creation of gesture attributes and on algorithmic improvements, and there is little or no research on feature selection for ZSGL. Deep learning has obviated the need for feature engineering in problems with large datasets. However, when data are scarce, it is critical to leverage domain information to create discriminative input features. The main goal of this work is to study the effect of three different feature extraction techniques ([...], [...] and [...] features) on the performance of ZSGL. In addition, we propose a bilinear auto-encoder approach for ZSGL, referred to as Joint Semantic Encoder (JSE), that jointly minimizes the reconstruction, semantic and classification losses. We conducted extensive experiments to compare the feature extraction techniques and to evaluate the performance of JSE against existing ZSL methods. For the [...] classification scenario, irrespective of the feature type, results showed that JSE outperforms other approaches by 5% (p < 0.01). When JSE is trained with [...] features in the [...] condition, JSE significantly outperforms other methods by 5% (p < 0.01).
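To make the idea of jointly minimizing three losses concrete, the following is a minimal sketch of what such an objective could look like. It is not the authors' formulation: the tied-weight linear decoder, the squared-error semantic term, the softmax cross-entropy over bilinear scores, the weights `lam_sem` and `lam_cls`, and the function name `jse_style_loss` are all assumptions made for illustration.

```python
import numpy as np

def jse_style_loss(X, S, Y, W, A, lam_sem=1.0, lam_cls=1.0):
    """Hypothetical joint objective: reconstruction + semantic + classification.

    X: (n, d) input features        S: (n, k) per-sample attribute targets
    Y: (n, c) one-hot class labels  W: (d, k) linear encoder (assumed)
    A: (c, k) class attribute signatures
    """
    Z = X @ W                      # encode features into the attribute space
    X_hat = Z @ W.T                # decode with tied weights (an assumption)
    recon = np.mean((X - X_hat) ** 2)   # reconstruction loss
    sem = np.mean((Z - S) ** 2)         # semantic (attribute regression) loss

    logits = Z @ A.T               # bilinear compatibility score x^T W a_c
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    cls = -np.mean(np.sum(Y * log_probs, axis=1))        # cross-entropy loss

    total = recon + lam_sem * sem + lam_cls * cls
    return total, (recon, sem, cls)

# Toy usage with random data: 5 samples, 6 features, 4 attributes, 3 classes.
rng = np.random.default_rng(0)
A = rng.random((3, 4))
Y = np.eye(3)[np.array([0, 1, 2, 0, 1])]
X = rng.normal(size=(5, 6))
W = 0.1 * rng.normal(size=(6, 4))
total, (recon, sem, cls) = jse_style_loss(X, Y @ A, Y, W, A)
```

In a zero-shot setting, an unseen class would be scored the same way by appending its attribute signature as a new row of `A`; no sample from that class is needed to compute the compatibility score.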