Algethami Nahlah, Farhud Raghad, Alghamdi Manal, Almutairi Huda, Sorani Maha, Aleisa Noura
Computer Science Department, College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia.
Information Technology Department, College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia.
Sensors (Basel). 2025 May 5;25(9):2916. doi: 10.3390/s25092916.
A significant communication gap persists between the deaf and hearing communities, often leaving deaf individuals isolated and marginalised. The challenge is especially pronounced for Arabic speakers, given the lack of publicly available Arabic Sign Language datasets and dedicated recognition systems. This study is the first to apply a Temporal Convolutional Network (TCN) model to Arabic Sign Language (ArSL) recognition. We created a custom dataset of the 30 most common sentences in ArSL and improved recognition performance by enhancing a Recurrent Neural Network (RNN) that incorporates a Bidirectional Long Short-Term Memory (BiLSTM) model. Our approach achieved markedly higher accuracy than the baseline RNN-BiLSTM model, contributing to recognition systems that could bridge communication barriers for the hearing-impaired community. Through a comparative analysis, we assessed how well the TCN and the enhanced RNN architecture capture the temporal dependencies and semantic nuances unique to Arabic Sign Language. Both models were trained and evaluated on the created dataset of Arabic sign gestures and compared on recognition accuracy, processing speed, and robustness to variations in signing style. The study thus provides insight into the strengths and limitations of TCNs and the enhanced RNN-BiLSTM for sign language recognition. The TCN model achieved an accuracy of 99.5%, while the RNN-BiLSTM model initially achieved 96% accuracy and improved to 99% after enhancement. Although the accuracy gap between the two models was small, the TCN demonstrated significant advantages in computational efficiency, requiring fewer resources and achieving faster inference times, which makes TCNs more practical for real-time sign language recognition applications.
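The paper's implementation is not included in this record; the following is a minimal PyTorch sketch, not the authors' code, of the two architecture families being compared: a TCN built from dilated causal convolutions with residual skips, and a two-layer BiLSTM sequence classifier. All dimensions are illustrative assumptions (126 keypoint features per frame, 30-frame clips, 64 hidden channels); only the 30-class output corresponds to the paper's 30-sentence dataset.

import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    """One TCN block: two dilated causal 1D convolutions plus a residual skip."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        pad = (kernel_size - 1) * dilation  # left-pad only, so the conv stays causal
        self.net = nn.Sequential(
            nn.ConstantPad1d((pad, 0), 0.0),
            nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation),
            nn.ReLU(),
            nn.ConstantPad1d((pad, 0), 0.0),
            nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation),
            nn.ReLU(),
        )
        self.skip = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return self.net(x) + self.skip(x)

class TCNClassifier(nn.Module):
    """Stack of TCN blocks with doubling dilation, classifying from the last step."""
    def __init__(self, n_features, n_classes, channels=(64, 64, 64)):
        super().__init__()
        layers, in_ch = [], n_features
        for i, ch in enumerate(channels):
            layers.append(TemporalBlock(in_ch, ch, kernel_size=3, dilation=2 ** i))
            in_ch = ch
        self.tcn = nn.Sequential(*layers)
        self.head = nn.Linear(in_ch, n_classes)

    def forward(self, x):                 # x: (batch, frames, features)
        y = self.tcn(x.transpose(1, 2))   # Conv1d expects (batch, channels, frames)
        return self.head(y[:, :, -1])     # classify from the final time step

class BiLSTMClassifier(nn.Module):
    """Baseline-style RNN: stacked bidirectional LSTM over the frame sequence."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                 # x: (batch, frames, features)
        out, _ = self.rnn(x)
        return self.head(out[:, -1, :])   # last step, both directions concatenated

# Smoke test with hypothetical dimensions: 30-frame clips, 126 features per frame.
x = torch.randn(8, 30, 126)
print(TCNClassifier(126, 30)(x).shape)     # torch.Size([8, 30])
print(BiLSTMClassifier(126, 30)(x).shape)  # torch.Size([8, 30])

The sketch illustrates why a TCN can be cheaper at inference, as the abstract reports: its convolutions process all frames in parallel, whereas the LSTM must step through the sequence recurrently.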