Continuous Sign Language Recognition through a Context-Aware Generative Adversarial Network.

Affiliation

Visual Computing Lab at Information Technologies Institute of Centre for Research and Technology Hellas, VCL of CERTH/ITI Hellas, 57001 Thessaloniki, Greece.

Publication information

Sensors (Basel). 2021 Apr 1;21(7):2437. doi: 10.3390/s21072437.

DOI:10.3390/s21072437
PMID:33916231
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8038055/
Abstract

Continuous sign language recognition is a weakly supervised task dealing with the identification of continuous sign gestures from video sequences, without any prior knowledge about the temporal boundaries between consecutive signs. Most of the existing methods focus mainly on the extraction of spatio-temporal visual features without exploiting text or contextual information to further improve the recognition accuracy. Moreover, the ability of deep generative models to effectively model data distribution has not been investigated yet in the field of sign language recognition. To this end, a novel approach for context-aware continuous sign language recognition using a generative adversarial network architecture, named as Sign Language Recognition Generative Adversarial Network (SLRGAN), is introduced. The proposed network architecture consists of a generator that recognizes sign language glosses by extracting spatial and temporal features from video sequences, as well as a discriminator that evaluates the quality of the generator's predictions by modeling text information at the sentence and gloss levels. The paper also investigates the importance of contextual information on sign language conversations for both Deaf-to-Deaf and Deaf-to-hearing communication. Contextual information, in the form of hidden states extracted from the previous sentence, is fed into the bidirectional long short-term memory module of the generator to improve the recognition accuracy of the network. At the final stage, sign language translation is performed by a transformer network, which converts sign language glosses to natural language text. Our proposed method achieved word error rates of 23.4%, 2.1% and 2.26% on the RWTH-Phoenix-Weather-2014 and the Chinese Sign Language (CSL) and Greek Sign Language (GSL) Signer Independent (SI) datasets, respectively.
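
The word error rates quoted above are conventionally computed as the token-level edit distance between the predicted and reference gloss sequences, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation script; the function name `word_error_rate` and the example glosses are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Both arguments are lists of tokens (here: sign language glosses).
    This is the standard Levenshtein-based definition; it is an
    assumption that the paper's reported figures use the same one.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the first i-1 reference
    # tokens and the first j hypothesis tokens.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[m] / n


# Made-up gloss sequences, for illustration only.
reference = "TOMORROW RAIN NORTH WIND".split()
hypothesis = "TOMORROW NORTH STRONG WIND".split()
print(word_error_rate(reference, hypothesis))  # 0.5 (two edits over four reference glosses)
```

Because the denominator is the reference length, a WER above 100% is possible when the hypothesis contains many insertions.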

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/027c/8038055/31843fe7a399/sensors-21-02437-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/027c/8038055/d924761f2e6a/sensors-21-02437-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/027c/8038055/805d1bcdeddc/sensors-21-02437-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/027c/8038055/b1082578a04a/sensors-21-02437-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/027c/8038055/b780de8fa6d7/sensors-21-02437-g005.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/027c/8038055/9a9a535dee74/sensors-21-02437-g006.jpg
