Novel Spatio-Temporal Continuous Sign Language Recognition Using an Attentive Multi-Feature Network.

Affiliations

Department of Computer Science and Information Engineering, National Central University, Taoyuan City 32001, Taiwan.

Faculty of Information and Communication Technology, Mahidol University, Nakhon Pathom 73170, Thailand.

Publication information

Sensors (Basel). 2022 Aug 26;22(17):6452. doi: 10.3390/s22176452.

Abstract

Given video streams, we aim to correctly detect unsegmented signs, a task known as continuous sign language recognition (CSLR). Although a growing number of deep learning methods have been proposed in this area, most of them rely on a single RGB feature, either the full-frame image or details of the hands and face. This scarcity of information in the CSLR training process heavily constrains the ability to learn multiple features from the input video frames. Moreover, exploiting all frames in a video for the CSLR task can lead to suboptimal performance, since frames differ in how much information they carry, with some contributing main features for inference and others mostly noise. Therefore, we propose a novel spatio-temporal CSLR method that uses an attentive multi-feature network to enhance CSLR by providing extra keypoint features. In addition, we exploit attention layers in the spatial and temporal modules to simultaneously emphasize multiple important features. Experimental results on both CSLR datasets demonstrate that the proposed method achieves superior performance compared with current state-of-the-art methods, with WER scores of 0.76 and 20.56 on the CSL and PHOENIX datasets, respectively.
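The results above are reported as word error rate (WER). As a reading aid, here is a minimal sketch of how WER is commonly computed for a predicted gloss sequence: the number of substitutions, deletions, and insertions needed to turn the prediction into the reference, divided by the reference length, expressed as a percentage. The function name `word_error_rate` and the example glosses are hypothetical and chosen only for illustration. The abstract does not describe the authors' evaluation code, so this should not be read as their implementation.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return WER in percent: 100 * (S + D + I) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one gloss")

    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j glosses of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference gloss missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra gloss in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return 100.0 * prev[-1] / len(reference)


# Hypothetical gloss sequences (not taken from CSL or PHOENIX).
reference = ["MORNING", "WEATHER", "RAIN", "NORTH"]
hypothesis = ["MORNING", "RAIN", "NORTH", "WIND"]
print(word_error_rate(reference, hypothesis))  # 50.0 (one deletion + one insertion over 4 glosses)
```

Lower is better: a perfect prediction gives 0, and because insertions are counted, WER can exceed 100 when the hypothesis is much longer than the reference.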

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db4a/9460873/e69ff4afc092/sensors-22-06452-g001.jpg
