Bilingual Sign Language Recognition: A YOLOv11-Based Model for Bangla and English Alphabets.

Author information

Navin Nawshin, Farid Fahmid Al, Rakin Raiyen Z, Tanzim Sadman S, Rahman Mashrur, Rahman Shakila, Uddin Jia, Karim Hezerul Abdul

Affiliations

Department of Computer Science, American International University-Bangladesh, Dhaka 1229, Bangladesh.

Centre for Image and Vision Computing (CIVC), COE for Artificial Intelligence, Faculty of Artificial Intelligence and Engineering (FAIE), Multimedia University, Cyberjaya 63100, Selangor, Malaysia.

Publication information

J Imaging. 2025 Apr 27;11(5):134. doi: 10.3390/jimaging11050134.

DOI:10.3390/jimaging11050134

PMID:40422991

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12112066/

Abstract

Sign language helps hearing- and speech-impaired individuals communicate effectively. However, interlingual communication between Bangla Sign Language (BdSL) and American Sign Language (ASL) remains difficult because no unified system exists. This study introduces a detection system that covers both sign languages to ease communication for their users. The authors developed and tested a deep learning-based sign language detection system that recognizes BdSL and ASL alphabets concurrently in real time. The approach uses a YOLOv11 object detection architecture trained on an open-source dataset of 9556 images covering 64 different letter signs from both languages. Data preprocessing was applied to improve model performance. Evaluation metrics, including precision, recall, and mean average precision (mAP), were computed to assess the model. The performance analysis shows a precision of 99.12% and an average recall of 99.63% after 30 epochs. The results indicate that the proposed model outperforms current techniques in sign language recognition (SLR) and can be used in assistive communication technologies and human-computer interaction systems.
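The abstract reports precision and recall without spelling out how they are obtained for an object detector. The following is a minimal sketch of the usual procedure, assuming detections are matched to ground-truth boxes of the same class at an IoU threshold of 0.5. The function names, the tuple layouts, and the threshold are illustrative assumptions, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """Compute detection precision and recall.

    predictions:   list of (label, confidence, box)  (hypothetical layout)
    ground_truths: list of (label, box)              (hypothetical layout)
    A prediction is a true positive if it matches a not-yet-matched
    ground-truth box of the same label with IoU >= iou_threshold.
    """
    matched = set()
    true_pos = 0
    # Visit predictions from most to least confident, as is conventional.
    for label, _conf, box in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_idx, best_iou = None, iou_threshold
        for idx, (gt_label, gt_box) in enumerate(ground_truths):
            if idx in matched or gt_label != label:
                continue
            overlap = iou(box, gt_box)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            matched.add(best_idx)
            true_pos += 1
    false_pos = len(predictions) - true_pos
    false_neg = len(ground_truths) - true_pos
    precision = true_pos / (true_pos + false_pos) if predictions else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truths else 0.0
    return precision, recall
```

Under these assumptions, mAP would be obtained by sweeping the confidence threshold per class, integrating the resulting precision-recall curve, and averaging over the 64 classes; that step is omitted here for brevity.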
