Recognition of Arabic Air-Written Letters: Machine Learning, Convolutional Neural Networks, and Optical Character Recognition (OCR) Techniques.

Author information

Nahar Khalid M O, Alsmadi Izzat, Al Mamlook Rabia Emhamed, Nasayreh Ahmad, Gharaibeh Hasan, Almuflih Ali Saeed, Alasim Fahad

Affiliations

Computer Science Department, Faculty of Information Technology and Computer Sciences, Yarmouk University, Irbid 21163, Jordan.

Department of Computing and Cyber Security, Texas A&M University-San Antonio, San Antonio, TX 78224, USA.

Publication information

Sensors (Basel). 2023 Nov 28;23(23):9475. doi: 10.3390/s23239475.

DOI:10.3390/s23239475

PMID:38067848

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10708688/

Abstract

Air writing is an increasingly important field that can benefit from the metaverse and can ease communication between humans and machines. The research literature on air writing and its applications shows significant work in English and Chinese, while little research has been conducted in other languages, such as Arabic. To fill this gap, we propose a hybrid model that combines feature extraction with deep learning models, then applies machine learning (ML) and optical character recognition (OCR) methods, and uses grid search and random search optimization to obtain the best model parameters and outcomes. Several machine learning methods (e.g., neural networks (NNs), random forest (RF), K-nearest neighbours (KNN), and support vector machine (SVM)) are applied to deep features extracted from deep convolutional neural networks (CNNs), such as VGG16, VGG19, and SqueezeNet. Our study uses the AHAWP dataset, which consists of diverse writing styles and hand sign variations, to train and evaluate the models. Preprocessing schemes are applied to improve data quality by reducing bias. Furthermore, OCR methods are integrated into our model to isolate individual letters from continuous air-written gestures and improve recognition results. The proposed model achieved its best accuracy, 88.8%, using an NN with VGG16.
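
The tuning step described in the abstract (a classical classifier applied to deep features, with grid and random search over its parameters) can be illustrated with a minimal sketch. It is not the authors' implementation. Synthetic Gaussian vectors stand in for features that would come from a CNN such as VGG16, a hand-written KNN stands in for the classifiers, and the function names and the parameter grid are assumptions made purely for illustration.

```python
import itertools
import random
from collections import Counter


def make_features(n_per_class=30, n_classes=3, dim=8, seed=0):
    """Synthetic stand-in for deep features: one Gaussian cluster per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            # Cluster centre is (label, ..., label); noise has unit variance.
            X.append([label + rng.gauss(0, 1) for _ in range(dim)])
            y.append(label)
    return X, y


def minkowski(a, b, p):
    """Minkowski distance: p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1 / p)


def knn_predict(train_X, train_y, x, k, p):
    """Majority vote among the k training vectors closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: minkowski(train_X[i], x, p))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, k, p, folds=5):
    """Cross-validated accuracy; sample i is held out in fold i % folds."""
    correct = 0
    for fold in range(folds):
        train = [i for i in range(len(X)) if i % folds != fold]
        test = [i for i in range(len(X)) if i % folds == fold]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(knn_predict(train_X, train_y, X[i], k, p) == y[i] for i in test)
    return correct / len(X)


def best_of(X, y, candidates):
    """Score every candidate and return (best_score, best_params)."""
    scored = [(cv_accuracy(X, y, **params), params) for params in candidates]
    return max(scored, key=lambda pair: pair[0])


def grid_search(X, y, grid):
    """Try every combination of values in the grid."""
    combos = itertools.product(*grid.values())
    return best_of(X, y, [dict(zip(grid, values)) for values in combos])


def random_search(X, y, grid, n_iter, seed=0):
    """Try n_iter combinations sampled at random from the same grid."""
    rng = random.Random(seed)
    samples = [{name: rng.choice(values) for name, values in grid.items()}
               for _ in range(n_iter)]
    return best_of(X, y, samples)


X, y = make_features()
grid = {"k": [1, 3, 5, 7], "p": [1, 2]}  # hypothetical search space
grid_score, grid_params = grid_search(X, y, grid)
rand_score, rand_params = random_search(X, y, grid, n_iter=4)
print("grid search:  ", grid_params, round(grid_score, 3))
print("random search:", rand_params, round(rand_score, 3))
```

Because the grid search here is exhaustive over the same space the random search samples from, its best score can never be lower. Random search becomes attractive when the space is too large to enumerate.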
