BdSL47: A complete depth-based Bangla sign alphabet and digit dataset.

Authors

Rayeed S M, Tuba Sidratul Tamzida, Mahmud Hasan, Mazumder Mumtahin Habib Ullah, Mukta Saddam Hossain, Hasan Kamrul

Affiliations

Systems and Software Lab (SSL), Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur 1704, Bangladesh.

Department of Computer Science and Engineering (CSE), United International University (UIU), United City, Madani Avenue, Dhaka 1212, Bangladesh.

Publication information

Data Brief. 2023 Nov 11;51:109799. doi: 10.1016/j.dib.2023.109799. eCollection 2023 Dec.

Abstract

Sign Language Recognition (SLR) is crucial for enabling communication between the deaf-mute and hearing communities. Nevertheless, developing a comprehensive sign language dataset is challenging because hand gestures are complex and vary widely. The challenge is particularly evident for Bangla Sign Language (BdSL), where the limited availability of depth datasets impedes accurate recognition. To address this issue, we propose BdSL47, an open-access depth dataset for 47 one-handed static signs of BdSL (10 digits, from ০ to ৯, and 37 letters, from অ to ঁ). The dataset was created using the MediaPipe framework to extract depth information. To classify the signs, we developed an Artificial Neural Network (ANN) model with a 63-node input layer, a 47-node output layer, and 4 hidden layers, with dropout in the last two hidden layers, the ReLU activation function, and the Adam optimizer. With the selected hyperparameters, the proposed ANN model effectively learns the spatial relationships and patterns in the depth-based gestural input features and achieves an F1 score of 97.84%, indicating the effectiveness of the approach compared to the baselines provided. The availability of BdSL47 as a comprehensive dataset can help improve the accuracy of SLR for BdSL using more advanced deep-learning models.
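The layer layout described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the 63 inputs are the x, y, z coordinates of the 21 hand landmarks that MediaPipe Hands reports per hand (21 × 3 = 63); the hidden-layer widths, the dropout rate, and all function names are hypothetical, since the abstract does not give them. Only the forward pass is shown, with random untrained weights; training with the Adam optimizer is omitted.

```python
import math
import random

NUM_LANDMARKS = 21                  # MediaPipe Hands reports 21 landmarks per hand
NUM_FEATURES = NUM_LANDMARKS * 3    # x, y, z per landmark -> 63 input nodes
NUM_CLASSES = 47                    # 10 digits + 37 letters
HIDDEN_SIZES = [128, 128, 64, 64]   # ASSUMED widths; not stated in the abstract
DROPOUT_RATE = 0.2                  # ASSUMED rate; not stated in the abstract


def flatten_landmarks(landmarks):
    """Turn 21 (x, y, z) tuples into one 63-value feature vector."""
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError("expected 21 landmarks")
    return [coord for point in landmarks for coord in point]


def init_layers(rng):
    """Create (weights, biases) per layer with small random weights."""
    sizes = [NUM_FEATURES] + HIDDEN_SIZES + [NUM_CLASSES]
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def dense(x, weights, biases):
    """Fully connected layer: one dot product plus bias per output node."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over the 47 class scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers, rng=None, training=False):
    """ReLU hidden layers; dropout only after the last two hidden layers."""
    rng = rng or random
    num_hidden = len(layers) - 1
    for i, (weights, biases) in enumerate(layers[:-1]):
        x = [max(0.0, v) for v in dense(x, weights, biases)]
        if training and i >= num_hidden - 2:
            keep = 1.0 - DROPOUT_RATE
            # Inverted dropout: zero some units, rescale the survivors.
            x = [v / keep if rng.random() < keep else 0.0 for v in x]
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


rng = random.Random(0)
layers = init_layers(rng)
# Stand-in for one hand's landmarks; real values would come from MediaPipe.
fake_hand = [(rng.random(), rng.random(), rng.random())
             for _ in range(NUM_LANDMARKS)]
features = flatten_landmarks(fake_hand)
probs = forward(features, layers)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Because the weights are untrained, `predicted_class` is meaningless here; the sketch only shows how a 63-value landmark vector maps to a probability distribution over the 47 signs.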


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b5b/10700367/7ee0ed32f8f4/gr1.jpg
