Sign language recognition based on dual-path background erasure convolutional neural network.

Author information

Zhang Junming, Bu Xiaolong, Wang Yushuai, Dong Hao, Zhang Yu, Wu Haitao

Affiliations

School of Computer and Artificial Intelligence, Huanghuai University, Zhumadian, 463000, Henan Province, China.

Key Laboratory of Intelligent Lighting, Henan Province, Zhumadian, 463000, China.

Publication information

Sci Rep. 2024 May 18;14(1):11360. doi: 10.1038/s41598-024-62008-z.

Abstract

Sign language is an important means of expression for people with hearing and speech disabilities, so sign language recognition has long been an important research topic. However, many current sign language recognition systems require complex deep models and rely on expensive sensors, which limits where they can be applied. To address this issue, this study proposes a lightweight, computer-vision-based dual-path background erasing deep convolutional neural network (DPCNN) for sign language recognition. The DPCNN consists of two paths: one learns the overall features, while the other learns the background features. The background features are gradually subtracted from the overall features to obtain an effective representation of the hand. These features are then flattened into a one-dimensional vector and passed through a fully connected layer with 128 output units. Finally, a fully connected layer with 24 output units serves as the output layer. On the ASL Finger Spelling dataset, the proposed method achieves an overall accuracy of 99.52% and a Macro-F1 score of 0.997. More importantly, the proposed method can be deployed on small terminals, broadening the application scenarios of sign language recognition. Experimental comparison shows that the proposed dual-path background erasure network generalizes better than the methods it was compared against.
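
The data flow described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. Only the two paths, the subtraction of background features from overall features, the flattening step, and the 128-unit and 24-unit fully connected layers come from the abstract. Everything else is an assumption made for illustration: the single-channel 8x8 input, the 3x3 kernels, the two stages, the ReLU after each subtraction, the random weights, and all function names (`dpcnn_forward`, `conv2d_same`, and so on). It should not be read as the authors' implementation.

```python
import math
import random


def conv2d_same(fmap, kernel):
    """Zero-padded 2D convolution on a single-channel map (list of lists)."""
    h, w = len(fmap), len(fmap[0])
    k = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += fmap[y][x] * kernel[di + k][dj + k]
            out[i][j] = acc
    return out


def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]


def subtract(a, b):
    """Element-wise a - b for two maps of the same shape."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def dense(vec, weights, bias):
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_kernel(rng, size=3):
    return [[rng.uniform(-0.5, 0.5) for _ in range(size)] for _ in range(size)]


def random_dense(rng, n_out, n_in):
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dpcnn_forward(image, n_stages=2, seed=0):
    """Forward pass of a toy dual-path network with random (untrained) weights."""
    rng = random.Random(seed)
    overall, background = image, image
    for _ in range(n_stages):
        # Each path has its own convolution (assumed 3x3, one channel).
        overall = relu(conv2d_same(overall, random_kernel(rng)))
        background = relu(conv2d_same(background, random_kernel(rng)))
        # Background erasure: subtract the background path's response
        # from the overall path at every stage ("gradually").
        overall = relu(subtract(overall, background))
    flat = [v for row in overall for v in row]        # flatten to 1-D
    w1, b1 = random_dense(rng, 128, len(flat))        # FC layer, 128 units
    hidden = [max(0.0, v) for v in dense(flat, w1, b1)]
    w2, b2 = random_dense(rng, 24, 128)               # output layer, 24 units
    return softmax(dense(hidden, w2, b2))


# Toy usage: a random 8x8 "image" yields a 24-way probability vector.
rng = random.Random(1)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
probs = dpcnn_forward(image)
```

The 24 outputs are assumed here to correspond to the 24 classes of the ASL Finger Spelling dataset. A real model would use multi-channel convolutions, trained weights, and most likely pooling, none of which the abstract specifies.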

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc97/11102471/557dea04e0c5/41598_2024_62008_Fig1_HTML.jpg

Similar articles

1. Sign language recognition based on dual-path background erasure convolutional neural network. Sci Rep. 2024 May 18;14(1):11360. doi: 10.1038/s41598-024-62008-z.
2. Hypertuned Deep Convolutional Neural Network for Sign Language Recognition. Comput Intell Neurosci. 2022 Apr 30;2022:1450822. doi: 10.1155/2022/1450822. eCollection 2022.
3. Video-Based Sign Language Recognition via ResNet and LSTM Network. J Imaging. 2024 Jun 20;10(6):149. doi: 10.3390/jimaging10060149.
4. Improved 3D-ResNet sign language recognition algorithm with enhanced hand features. Sci Rep. 2022 Oct 24;12(1):17812. doi: 10.1038/s41598-022-21636-z.
7. American Sign Language Alphabet Recognition by Extracting Feature from Hand Pose Estimation. Sensors (Basel). 2021 Aug 31;21(17):5856. doi: 10.3390/s21175856.
8. BdSL47: A complete depth-based Bangla sign alphabet and digit dataset. Data Brief. 2023 Nov 11;51:109799. doi: 10.1016/j.dib.2023.109799. eCollection 2023 Dec.
9. Spatial Attention-Based 3D Graph Convolutional Neural Network for Sign Language Recognition. Sensors (Basel). 2022 Jun 16;22(12):4558. doi: 10.3390/s22124558.
10. Convolutional and recurrent neural network for human activity recognition: Application on American sign language. PLoS One. 2020 Feb 19;15(2):e0228869. doi: 10.1371/journal.pone.0228869. eCollection 2020.
