

Recognition of Non-Manual Content in Continuous Japanese Sign Language.

Affiliations

Honda Research Institute Japan Co., Ltd., Wako-shi, Saitama 351-0188, Japan.

Faculty of Sciences and Engineering, Saarland University, 66123 Saarbrücken, Germany.

Publication

Sensors (Basel). 2020 Oct 1;20(19):5621. doi: 10.3390/s20195621.

DOI: 10.3390/s20195621
PMID: 33019608
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7582855/
Abstract

The quality of recognition systems for continuous utterances in signed languages has advanced considerably in recent years. However, research efforts often do not address specific linguistic features of signed languages, such as non-manual expressions. In this work, we evaluate the potential of a single-video-camera-based recognition system with respect to the latter. To this end, we introduce a two-stage pipeline based on two-dimensional body joint positions extracted from RGB camera data. The system first separates the data flow of a signed expression into meaningful word segments using a frame-wise binary Random Forest. Next, every segment is transformed into an image-like shape and classified with a Convolutional Neural Network. The proposed system is then evaluated on a data set of continuous sentence expressions in Japanese Sign Language with a variation of non-manual expressions. Exploring multiple variations of data representations and network parameters, we are able to distinguish word segments of specific non-manual intonations with 86% accuracy from the underlying body joint movement data. Full sentence predictions achieve a total Word Error Rate of 15.75%. This marks an improvement of 13.22% compared to ground-truth predictions obtained from labeling that is insensitive to non-manual content. Consequently, our analysis constitutes an important contribution to a better understanding of mixed manual and non-manual content in signed communication.
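The first stage of the pipeline described above can be sketched in miniature: a frame-wise binary Random Forest labels each frame of a joint-position sequence as word or transition, and contiguous word-labeled runs become the segments that the second-stage CNN would classify. This is a minimal illustrative sketch, not the authors' implementation: the joint data, frame labels, shapes, and the `contiguous_segments` helper are all synthetic/hypothetical stand-ins.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for 2D body-joint trajectories:
# T frames, each with J joints * 2 coordinates (x, y).
rng = np.random.default_rng(0)
T, J = 200, 25
X = rng.normal(size=(T, J * 2))

# Synthetic frame labels: 1 = inside a word segment, 0 = transition movement.
y = np.zeros(T, dtype=int)
y[20:60] = y[90:130] = y[150:190] = 1

# Stage 1: frame-wise binary Random Forest separating word frames
# from transition frames (trained here on the same toy data it predicts).
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X, y)
pred = rf.predict(X)

def contiguous_segments(labels):
    """Collect (start, end) index pairs of contiguous runs labeled 1."""
    segs, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v == 0 and start is not None:
            segs.append((start, i))
            start = None
    if start is not None:
        segs.append((start, len(labels)))
    return segs

segments = contiguous_segments(pred)
# Stage 2 (not shown): each segment would be resampled into a fixed-size,
# image-like array and classified with a Convolutional Neural Network.
```

On real data the forest would of course be trained and evaluated on disjoint sentence sets; the point here is only the shape of the data flow from frames to word segments.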


Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/da2fbb0ac9ee/sensors-20-05621-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/479865405a66/sensors-20-05621-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/65d61d6d1575/sensors-20-05621-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/dd2dca57cbea/sensors-20-05621-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/3ae5480766c7/sensors-20-05621-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/6cb2339f6c6e/sensors-20-05621-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faf7/7582855/24b31de00585/sensors-20-05621-g007.jpg

Similar Articles

1
Recognition of Non-Manual Content in Continuous Japanese Sign Language.
Sensors (Basel). 2020 Oct 1;20(19):5621. doi: 10.3390/s20195621.
2
Neural systems supporting linguistic structure, linguistic experience, and symbolic communication in sign language and gesture.
Proc Natl Acad Sci U S A. 2015 Sep 15;112(37):11684-9. doi: 10.1073/pnas.1510527112. Epub 2015 Aug 17.
3
Extricating Manual and Non-Manual Features for Subunit Level Medical Sign Modelling in Automatic Sign Language Classification and Recognition.
J Med Syst. 2017 Sep 22;41(11):175. doi: 10.1007/s10916-017-0819-z.
4
Kinect-ing the Dots: Using Motion-Capture Technology to Distinguish Sign Language Linguistic From Gestural Expressions.
Lang Speech. 2024 Mar;67(1):255-276. doi: 10.1177/00238309231169502. Epub 2023 Jun 14.
5
British Sign Language Recognition via Late Fusion of Computer Vision and Leap Motion with Transfer Learning to American Sign Language.
Sensors (Basel). 2020 Sep 9;20(18):5151. doi: 10.3390/s20185151.
6
The Impact of Transitional Movements and Non-Manual Markings on the Disambiguation of Locally Ambiguous Argument Structures in Austrian Sign Language (ÖGS).
Lang Speech. 2019 Dec;62(4):652-680. doi: 10.1177/0023830918801399. Epub 2018 Oct 24.
7
Multi-cue temporal modeling for skeleton-based sign language recognition.
Front Neurosci. 2023 Apr 5;17:1148191. doi: 10.3389/fnins.2023.1148191. eCollection 2023.
8
On the Conventionalization of Mouth Actions in Australian Sign Language.
Lang Speech. 2016 Mar;59(Pt 1):3-42. doi: 10.1177/0023830915569334.
9
Evolving artificial sign languages in the lab: From improvised gesture to systematic sign.
Cognition. 2019 Nov;192:103964. doi: 10.1016/j.cognition.2019.05.001. Epub 2019 Jul 11.
10
Subject preference emerges as cross-modal strategy for linguistic processing.
Brain Res. 2018 Jul 15;1691:105-117. doi: 10.1016/j.brainres.2018.03.029. Epub 2018 Apr 5.

Cited By

1
Continuous Sign Language Recognition and Its Translation into Intonation-Colored Speech.
Sensors (Basel). 2023 Jul 13;23(14):6383. doi: 10.3390/s23146383.

References

1
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields.
IEEE Trans Pattern Anal Mach Intell. 2021 Jan;43(1):172-186. doi: 10.1109/TPAMI.2019.2929257. Epub 2020 Dec 4.
2
Weakly Supervised Learning with Multi-Stream CNN-LSTM-HMMs to Discover Sequential Parallelism in Sign Language Videos.
IEEE Trans Pattern Anal Mach Intell. 2020 Sep;42(9):2306-2320. doi: 10.1109/TPAMI.2019.2911077. Epub 2019 Apr 15.
3
Discriminative exemplar coding for sign language recognition with Kinect.
IEEE Trans Cybern. 2013 Oct;43(5):1418-28. doi: 10.1109/TCYB.2013.2265337. Epub 2013 Jun 19.