Sign language recognition using modified deep learning network and hybrid optimization: a hybrid optimizer (HO) based optimized CNNSa-LSTM approach.

Authors

Baihan Abdullah, Alutaibi Ahmed I, Alshehri Mohammed, Sharma Sunil Kumar

Affiliations

Computer Science Department, Community College, King Saud University, 11437, Riyadh, Saudi Arabia.

Department of Computer Engineering, College of Computer and Information Sciences, Majmaah University, 11952, Majmaah, Saudi Arabia.

Publication information

Sci Rep. 2024 Oct 30;14(1):26111. doi: 10.1038/s41598-024-76174-7.

DOI: 10.1038/s41598-024-76174-7
PMID: 39477993
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11525556/
Abstract

Speech impairment limits a person's capacity for oral and auditory communication. A real-time sign language detector can improve communication between deaf people and the general public. Recent studies have advanced motion and gesture identification using Deep Learning (DL) methods and computer vision, but developing static and dynamic sign language recognition (SLR) models remains a challenging area of research. The difficulty lies in obtaining a model that handles continuous signs independently of the signer: differences among signers in speed, duration, and many other factors make it hard to build a model with high accuracy and continuity. This study focuses on SLR using a modified DL network and a hybrid optimization approach. Spatial and geometric features are extracted with the Visual Geometry Group 16 (VGG16) network, and motion features are extracted using optical flow. A new DL model, CNNSa-LSTM, combines a Convolutional Neural Network (CNN), Self-Attention (SA), and Long Short-Term Memory (LSTM) to identify sign language: the CNN performs spatial analysis, the SA mechanism focuses on relevant features, and the LSTM models temporal dependencies. The proposed CNNSa-LSTM model enhances performance in tasks involving complex, sequential data, such as sign language processing. In addition, a Hybrid Optimizer (HO) is proposed that combines the Hippopotamus Optimization Algorithm (HOA) and the Pathfinder Algorithm (PFA). The proposed model was implemented in Python and compared with existing models in terms of accuracy (98.7%), sensitivity (98.2%), precision (98.5%), Word Error Rate (WER) (0.131), Sign Error Rate (SER) (0.114), and Normalized Discounted Cumulative Gain (NDCG) (98%). The proposed model recorded the highest accuracy, 98.7%.
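
The abstract reports a Word Error Rate but does not say how it was computed. As a rough illustration only, the sketch below uses the conventional definition: the word-level edit distance between the reference and the recognized sentence, divided by the number of reference words. The function names `edit_distance` and `word_error_rate` and the sample sentences are hypothetical and are not taken from the paper, whose evaluation protocol may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref tokens seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        hyp_tokens = hyp.split()
        total_errors += edit_distance(ref_tokens, hyp_tokens)
        total_words += len(ref_tokens)
    if total_words == 0:
        raise ValueError("reference corpus contains no words")
    return total_errors / total_words


if __name__ == "__main__":
    # Hypothetical gloss sequences, for illustration only.
    refs = ["i want water", "where is the school"]
    hyps = ["i want water", "where the school"]
    print(word_error_rate(refs, hyps))
```

A Sign Error Rate could presumably be computed the same way over sign (gloss) tokens instead of words, although the abstract does not define it either.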


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/df23cde793c1/41598_2024_76174_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/d9ce75780d94/41598_2024_76174_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/2ed31b0646c5/41598_2024_76174_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/956071506692/41598_2024_76174_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/6f3b0d086674/41598_2024_76174_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/c3565815f309/41598_2024_76174_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/b6e982e1ad6c/41598_2024_76174_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/36dd01e3f572/41598_2024_76174_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/5e80bb6750b7/41598_2024_76174_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/7b36e5129e53/41598_2024_76174_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/06db83c811b8/41598_2024_76174_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/8521e2e60113/41598_2024_76174_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/9d01706e3555/41598_2024_76174_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/d5d443f791dd/41598_2024_76174_Fig14_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/2a923a1c2f96/41598_2024_76174_Fig15_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/a96546b3d360/41598_2024_76174_Fig16_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/7883ebc3a434/41598_2024_76174_Fig17_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/4dc40430cd3a/41598_2024_76174_Fig18_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/6e0e3e16eb99/41598_2024_76174_Fig19_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8846/11525556/cc01a1a9badd/41598_2024_76174_Fig20_HTML.jpg

Similar articles

1
Sign language recognition using modified deep learning network and hybrid optimization: a hybrid optimizer (HO) based optimized CNNSa-LSTM approach.
Sci Rep. 2024 Oct 30;14(1):26111. doi: 10.1038/s41598-024-76174-7.
2
Automated sign language detection and classification using reptile search algorithm with hybrid deep learning.
Heliyon. 2023 Dec 8;10(1):e23252. doi: 10.1016/j.heliyon.2023.e23252. eCollection 2024 Jan 15.
3
Real-Time Arabic Sign Language Recognition Using a Hybrid Deep Learning Model.
Sensors (Basel). 2024 Jun 6;24(11):3683. doi: 10.3390/s24113683.
4
Video-Based Sign Language Recognition via ResNet and LSTM Network.
J Imaging. 2024 Jun 20;10(6):149. doi: 10.3390/jimaging10060149.
5
Atom Search Optimization with Deep Learning Enabled Arabic Sign Language Recognition for Speaking and Hearing Disability Persons.
Healthcare (Basel). 2022 Aug 24;10(9):1606. doi: 10.3390/healthcare10091606.
6
Self-attention (SA) temporal convolutional network (SATCN)-long short-term memory neural network (SATCN-LSTM): an advanced python code for predicting groundwater level.
Environ Sci Pollut Res Int. 2023 Aug;30(40):92903-92921. doi: 10.1007/s11356-023-28771-8. Epub 2023 Jul 27.
7
Signer-Independent Arabic Sign Language Recognition System Using Deep Learning Model.
Sensors (Basel). 2023 Aug 14;23(16):7156. doi: 10.3390/s23167156.
8
A transfer learning-based CNN and LSTM hybrid deep learning model to classify motor imagery EEG signals.
Comput Biol Med. 2022 Apr;143:105288. doi: 10.1016/j.compbiomed.2022.105288. Epub 2022 Feb 10.
9
S-LSTM-ATT: a hybrid deep learning approach with optimized features for emotion recognition in electroencephalogram.
Health Inf Sci Syst. 2023 Aug 29;11(1):40. doi: 10.1007/s13755-023-00242-x. eCollection 2023 Dec.
10
Sign Language Recognition Using the Electromyographic Signal: A Systematic Literature Review.
Sensors (Basel). 2023 Oct 9;23(19):8343. doi: 10.3390/s23198343.

Cited by

1
Attention-based hybrid deep learning model with CSFOA optimization and G-TverskyUNet3+ for Arabic sign language recognition.
Sci Rep. 2025 Jun 26;15(1):20313. doi: 10.1038/s41598-025-03560-0.
2
MHO: A Modified Hippopotamus Optimization Algorithm for Global Optimization and Engineering Design Problems.
Biomimetics (Basel). 2025 Feb 5;10(2):90. doi: 10.3390/biomimetics10020090.
