Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine.

Authors

Rastgoo Razieh, Kiani Kourosh, Escalera Sergio

Affiliations

Electrical and Computer Engineering Department, Semnan University, Semnan 3513119111, Iran.

Department of Mathematics and Informatics, University of Barcelona and Computer Vision Center, 08007 Barcelona, Spain.

Publication information

Entropy (Basel). 2018 Oct 23;20(11):809. doi: 10.3390/e20110809.

DOI:10.3390/e20110809
PMID:33266533
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7512373/
Abstract

In this paper, a deep learning approach, the Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how well the RBM, as a deep generative model, can model the distribution of the input data to improve recognition of unseen data. Two modalities, RGB and depth, are provided as model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are taken, and the hand in each crop is detected using a Convolutional Neural Network (CNN). Three types of detected hand images are then generated for each modality and input to RBMs. The outputs of the RBMs for the two modalities are fused in another RBM to recognize the sign label of the input image. The proposed multi-modal model is trained on all, and on subsets, of the American alphabet and digits in four publicly available datasets. We also evaluate the robustness of the proposed model against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusion methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey's Center for Vision, Speech and Signal Processing, the NYU dataset, and the ASL Fingerspelling A dataset.
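The pipeline described in the abstract (five crops per image, a noisy variant of each crop, one RBM per modality, and a further RBM that fuses the two) can be sketched roughly as follows. This is a minimal illustration in NumPy, not the authors' implementation: the corner-plus-center crop layout, the Gaussian noise model, the layer sizes, the CD-1 training rule, and the synthetic data are all assumptions, and the CNN hand detector and the final label prediction are omitted.

```python
import numpy as np

def five_crops(img, size):
    """Four corner crops plus the center crop (an assumed crop layout)."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    offsets = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size), (top, left)]
    return [img[r:r + size, c:c + size] for r, c in offsets]

def add_noise(img, rng, sigma=0.1):
    """Additive Gaussian noise, clipped back to [0, 1] (assumed noise model)."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase, one Gibbs step, negative phase.
        h0 = self.hidden_probs(v0)
        h_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error

def train(rbm, data, epochs=20):
    return [rbm.cd1_step(data) for _ in range(epochs)]

rng = np.random.default_rng(0)
rgb = rng.random((8, 32, 32))    # synthetic stand-ins for single-channel RGB-derived images
depth = rng.random((8, 32, 32))  # synthetic stand-ins for depth maps

def to_vectors(images):
    # One row per (image, crop): flatten each noisy 24x24 crop.
    return np.stack([add_noise(c, rng).ravel()
                     for im in images for c in five_crops(im, 24)])

x_rgb, x_depth = to_vectors(rgb), to_vectors(depth)

# One RBM per modality.
rbm_rgb = RBM(x_rgb.shape[1], 64, rng)
rbm_depth = RBM(x_depth.shape[1], 64, rng)
train(rbm_rgb, x_rgb)
train(rbm_depth, x_depth)

# Fusion RBM over the concatenated hidden activations of both modalities.
fused_in = np.hstack([rbm_rgb.hidden_probs(x_rgb), rbm_depth.hidden_probs(x_depth)])
rbm_fusion = RBM(fused_in.shape[1], 32, rng)
errors = train(rbm_fusion, fused_in)
features = rbm_fusion.hidden_probs(fused_in)
```

In this sketch `features` holds one fused representation per crop. Turning that into a sign label would need an additional step, such as a classifier on top of the fused features, which the abstract does not specify.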

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/5c9b523f7881/entropy-20-00809-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/0a3a3cd406c3/entropy-20-00809-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/076744c36e0c/entropy-20-00809-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/47d3579f8d73/entropy-20-00809-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/2daa4c13325b/entropy-20-00809-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/a9bb7a4db5bb/entropy-20-00809-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/f14272beeb88/entropy-20-00809-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/984f36e7c237/entropy-20-00809-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/69287980994e/entropy-20-00809-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/39382da96d56/entropy-20-00809-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/b4d586c54286/entropy-20-00809-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/438266296710/entropy-20-00809-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/59d4/7512373/f94021fd05c0/entropy-20-00809-g013.jpg

Similar articles

1
Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine.
Entropy (Basel). 2018 Oct 23;20(11):809. doi: 10.3390/e20110809.
2
American Sign Language Alphabet Recognition by Extracting Feature from Hand Pose Estimation.
Sensors (Basel). 2021 Aug 31;21(17):5856. doi: 10.3390/s21175856.
3
Dataset of Pakistan Sign Language and Automatic Recognition of Hand Configuration of Urdu Alphabet through Machine Learning.
Data Brief. 2021 Apr 2;36:107021. doi: 10.1016/j.dib.2021.107021. eCollection 2021 Jun.
4
Expected energy-based restricted Boltzmann machine for classification.
Neural Netw. 2015 Apr;64:29-38. doi: 10.1016/j.neunet.2014.09.006. Epub 2014 Sep 28.
5
Real-Time Hand Gesture Recognition Using Fine-Tuned Convolutional Neural Network.
Sensors (Basel). 2022 Jan 18;22(3):706. doi: 10.3390/s22030706.
6
British Sign Language Recognition via Late Fusion of Computer Vision and Leap Motion with Transfer Learning to American Sign Language.
Sensors (Basel). 2020 Sep 9;20(18):5151. doi: 10.3390/s20185151.
7
American Sign Language Recognition Using Leap Motion Controller with Machine Learning Approach.
Sensors (Basel). 2018 Oct 19;18(10):3554. doi: 10.3390/s18103554.
8
Sign language recognition using the fusion of image and hand landmarks through multi-headed convolutional neural network.
Sci Rep. 2023 Oct 9;13(1):16975. doi: 10.1038/s41598-023-43852-x.
9
Bangla Sign Language (BdSL) Alphabets and Numerals Classification Using a Deep Learning Model.
Sensors (Basel). 2022 Jan 12;22(2):574. doi: 10.3390/s22020574.
10
CNN Deep Learning with Wavelet Image Fusion of CCD RGB-IR and Depth-Grayscale Sensor Data for Hand Gesture Intention Recognition.
Sensors (Basel). 2022 Jan 21;22(3):803. doi: 10.3390/s22030803.

Cited by

1
A non-anatomical graph structure for boundary detection in continuous sign language.
Sci Rep. 2025 Jul 16;15(1):25683. doi: 10.1038/s41598-025-11598-3.
2
Sign language recognition using the fusion of image and hand landmarks through multi-headed convolutional neural network.
Sci Rep. 2023 Oct 9;13(1):16975. doi: 10.1038/s41598-023-43852-x.
3
Signer-Independent Arabic Sign Language Recognition System Using Deep Learning Model.
Sensors (Basel). 2023 Aug 14;23(16):7156. doi: 10.3390/s23167156.
4
Vision-based Pakistani sign language recognition using bag-of-words and support vector machines.
Sci Rep. 2022 Dec 9;12(1):21325. doi: 10.1038/s41598-022-15864-6.
5
Applying Hybrid Deep Neural Network for the Recognition of Sign Language Words Used by the Deaf COVID-19 Patients.
Arab J Sci Eng. 2023;48(2):1349-1362. doi: 10.1007/s13369-022-06843-0. Epub 2022 Apr 22.
6
American Sign Language Alphabet Recognition by Extracting Feature from Hand Pose Estimation.
Sensors (Basel). 2021 Aug 31;21(17):5856. doi: 10.3390/s21175856.
7
Statistical Machine Learning for Human Behaviour Analysis.
Entropy (Basel). 2020 May 7;22(5):530. doi: 10.3390/e22050530.
8
Entropy Based Data Expansion Method for Blind Image Quality Assessment.
Entropy (Basel). 2019 Dec 31;22(1):60. doi: 10.3390/e22010060.

References

1
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.