Deep computer vision with artificial intelligence based sign language recognition to assist hearing and speech-impaired individuals.

Authors

Almjally Abrar, Almukadi Wafa Sulaiman

Affiliations

Department of Information Technology, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, 13318, Saudi Arabia.

King Salman Center for Disability Research, Riyadh, 11614, Saudi Arabia.

Publication information

Sci Rep. 2025 Sep 2;15(1):32268. doi: 10.1038/s41598-025-09106-8.

DOI:10.1038/s41598-025-09106-8

PMID:40890185

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12402225/

Abstract

Sign language (SL) is a non-verbal language that deaf and hard-of-hearing individuals use for daily communication. Research on SL recognition (SLR) has recently become an important area of development. Current successes provide a basis for future applications that support the integration of deaf and hard-of-hearing people, and SLR could help break down the barriers that SL users face in the community. SLR methods generally fall into two main categories: glove-based and vision-based techniques. Several researchers have proposed techniques that build on advances in deep learning (DL) models for computer vision (CV) and have applied them to SLR. This study presents a novel Harris Hawk Optimization-Based Deep Learning Model for Sign Language Recognition (HHODLM-SLR) technique. The HHODLM-SLR technique focuses on the automatic detection and classification of SL for hearing and speech-impaired individuals. First, the image pre-processing stage applies bilateral filtering (BF) to remove noise from the input images. Next, the ResNet-152 model is employed for feature extraction, and a bidirectional long short-term memory (Bi-LSTM) model performs the recognition. Finally, the Harris hawk optimization (HHO) approach tunes the hyperparameter values of the Bi-LSTM model to improve classification performance. The efficiency of the HHODLM-SLR methodology is validated on an SL dataset. In the experimental analysis, the HHODLM-SLR methodology achieved an accuracy of 98.95%, superior to existing techniques.
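
The abstract does not state the HHO configuration or which Bi-LSTM hyperparameters are tuned, so the following is only a minimal sketch of how a Harris hawks optimizer could search a hyperparameter space, not the authors' implementation. It follows the commonly published HHO update rules (exploration, soft and hard besiege, and besiege with Levy-flight dives) in simplified form. The function names, the two searched hyperparameters (log10 learning rate and hidden units), their bounds, and `toy_validation_loss` are all assumptions made for illustration; in practice the objective would train a Bi-LSTM and return its validation loss.

```python
import math
import random


def levy_step(dim, rng, beta=1.5):
    """Levy-flight step per dimension (Mantegna's method)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return [0.01 * rng.gauss(0, sigma) / max(abs(rng.gauss(0, 1)), 1e-12) ** (1 / beta)
            for _ in range(dim)]


def hho_minimize(objective, lower, upper, n_hawks=10, n_iter=50, seed=0):
    """Minimize `objective` within box bounds; returns (best, best_fit, history)."""
    rng = random.Random(seed)
    dim = len(lower)

    def clip(x):
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    hawks = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
             for _ in range(n_hawks)]
    fits = [objective(h) for h in hawks]
    i_best = min(range(n_hawks), key=fits.__getitem__)
    rabbit, rabbit_fit = hawks[i_best][:], fits[i_best]  # best solution so far
    history = [rabbit_fit]

    for t in range(n_iter):
        mean = [sum(h[d] for h in hawks) / n_hawks for d in range(dim)]
        for i in range(n_hawks):
            x = hawks[i]
            energy = 2 * rng.uniform(-1, 1) * (1 - t / n_iter)  # escaping energy E
            if abs(energy) >= 1:
                # Exploration: perch relative to a random hawk or to the rabbit.
                if rng.random() >= 0.5:
                    xr = hawks[rng.randrange(n_hawks)]
                    r1, r2 = rng.random(), rng.random()
                    new = [xr[d] - r1 * abs(xr[d] - 2 * r2 * x[d]) for d in range(dim)]
                else:
                    r3, r4 = rng.random(), rng.random()
                    new = [(rabbit[d] - mean[d])
                           - r3 * (lower[d] + r4 * (upper[d] - lower[d]))
                           for d in range(dim)]
            else:
                soft = abs(energy) >= 0.5
                jump = 2 * (1 - rng.random())  # rabbit jump strength J
                if rng.random() >= 0.5:
                    if soft:  # soft besiege
                        new = [(rabbit[d] - x[d]) - energy * abs(jump * rabbit[d] - x[d])
                               for d in range(dim)]
                    else:  # hard besiege
                        new = [rabbit[d] - energy * abs(rabbit[d] - x[d])
                               for d in range(dim)]
                else:
                    # Besiege with progressive rapid dives (Levy flight).
                    ref = x if soft else mean
                    y = clip([rabbit[d] - energy * abs(jump * rabbit[d] - ref[d])
                              for d in range(dim)])
                    levy = levy_step(dim, rng)
                    z = clip([y[d] + rng.random() * levy[d] for d in range(dim)])
                    new = x  # keep the current position unless a dive improves it
                    if objective(y) < fits[i]:
                        new = y
                    elif objective(z) < fits[i]:
                        new = z
            hawks[i] = clip(new)
            fits[i] = objective(hawks[i])
            if fits[i] < rabbit_fit:
                rabbit, rabbit_fit = hawks[i][:], fits[i]
        history.append(rabbit_fit)
    return rabbit, rabbit_fit, history


def toy_validation_loss(params):
    """Hypothetical stand-in for a Bi-LSTM validation loss (no training here)."""
    log_lr, hidden = params
    return (log_lr + 3.0) ** 2 + ((hidden - 128.0) / 128.0) ** 2


# Search log10(learning rate) in [-5, -1] and hidden units in [16, 512].
best, best_fit, history = hho_minimize(toy_validation_loss, [-5.0, 16.0], [-1.0, 512.0])
print(best, best_fit)
```

Because the best-so-far solution is only replaced by a strictly better one, `history` never increases. Replacing the toy objective with a real training run would make each evaluation expensive, so the population size and iteration count would need to be kept small.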
