Recognition of inscribed cursive Pashtu numeral through optimized deep learning.

Authors

Syed Sibtain, Khan Khalil, Khan Maqbool, Khan Rehan Ullah, Aloraini Abdulrahman

Affiliations

Department of IT & CS, Pak-Austria Fachhochschule Institute of Applied Sciences and Technology, Haripur, KP, Pakistan.

Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan.

Publication information

PeerJ Comput Sci. 2024 Jul 11;10:e2124. doi: 10.7717/peerj-cs.2124. eCollection 2024.

DOI:10.7717/peerj-cs.2124
PMID:39145239
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11323096/
Abstract

Pashtu is one of the most widely spoken languages in South and Central Asia. Recognizing Pashtu numerals is challenging because of the script's cursive nature. A machine learning-based optical character recognition (OCR) model can nevertheless be an effective way to tackle this problem. The main aim of the study is to propose an optimized machine learning model that can efficiently identify the Pashtu numerals 0-9. In the methodology, the data are first organized into separate directories, each representing one label. The data are then preprocessed: images are resized to 32 × 32 pixels, normalized by dividing their pixel values by 255, and reshaped for model input. The dataset was split in a ratio of 80:20. Hyperparameters for the LSTM and CNN models were then selected by trial and error. The models were evaluated using accuracy and loss graphs, classification reports, and confusion matrices. The results indicate that the proposed LSTM model slightly outperforms the proposed CNN model, with a macro-averaged precision of 0.9877, recall of 0.9876, and F1 score of 0.9876. Both models recognize Pashtu numerals with an accuracy of nearly 98%.
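
The preprocessing, 80:20 split, and macro-averaged metrics described in the abstract can be illustrated with a small sketch. This is not the authors' code. It is a dependency-free Python illustration under stated assumptions: nearest-neighbour resizing, a shuffled split with a fixed seed, and macro F1 taken as the mean of per-class F1 scores are all choices made here, since the abstract does not specify them. The function names (`resize_nearest`, `preprocess`, `split_80_20`, `macro_scores`) are hypothetical.

```python
import random


def resize_nearest(img, size=32):
    """Resize a 2-D grayscale image (list of rows) to size x size.

    Nearest-neighbour sampling is an assumption; the abstract only says
    that images are resized to 32 x 32.
    """
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def preprocess(img, size=32):
    """Resize, scale pixel values from [0, 255] to [0, 1], and add a
    trailing channel axis so each image has shape (size, size, 1)."""
    resized = resize_nearest(img, size)
    return [[[px / 255.0] for px in row] for row in resized]


def split_80_20(samples, labels, seed=0):
    """Shuffle indices with a fixed seed and split 80:20.

    Shuffling and the seed are assumptions made for reproducibility.
    """
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * len(idx))
    first, second = idx[:cut], idx[cut:]
    return ([samples[i] for i in first], [labels[i] for i in first],
            [samples[i] for i in second], [labels[i] for i in second])


def macro_scores(y_true, y_pred, n_classes=10):
    """Return macro-averaged (precision, recall, F1) over n_classes.

    Each metric is computed per class and then averaged with equal
    weight; classes with an undefined score contribute 0.0.
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for k in range(n_classes):
        tp = sum(1 for t, p in pairs if t == k and p == k)
        fp = sum(1 for t, p in pairs if t != k and p == k)
        fn = sum(1 for t, p in pairs if t == k and p != k)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    return (sum(precisions) / n_classes,
            sum(recalls) / n_classes,
            sum(f1s) / n_classes)


# Usage with synthetic data (random 64 x 64 images, labels 0-9).
rng = random.Random(0)
images = [[[rng.randrange(256) for _ in range(64)] for _ in range(64)]
          for _ in range(20)]
labels = [i % 10 for i in range(20)]
x = [preprocess(im) for im in images]
x_a, y_a, x_b, y_b = split_80_20(x, labels)
```

In practice a numerical library would replace the nested lists, and the resulting arrays would be fed to the CNN or LSTM model. The pure-Python form is used here only to keep the sketch self-contained.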

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/3c2c3a28a1c2/peerj-cs-10-2124-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/c19150fb0ed0/peerj-cs-10-2124-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/92cc8ad7057c/peerj-cs-10-2124-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/a3e2be1a94ca/peerj-cs-10-2124-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/00ba4ae14229/peerj-cs-10-2124-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/639215e1e822/peerj-cs-10-2124-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/de5d83e2c085/peerj-cs-10-2124-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/da2dfefa9301/peerj-cs-10-2124-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1645/11323096/8da8c8be51dc/peerj-cs-10-2124-g009.jpg

Similar articles

1
Recognition of inscribed cursive Pashtu numeral through optimized deep learning.
PeerJ Comput Sci. 2024 Jul 11;10:e2124. doi: 10.7717/peerj-cs.2124. eCollection 2024.
2
Pashtu Language Digits Dataset.
Data Brief. 2022 Oct 26;45:108701. doi: 10.1016/j.dib.2022.108701. eCollection 2022 Dec.
3
Detection of sweet corn seed viability based on hyperspectral imaging combined with firefly algorithm optimized deep learning.
Front Plant Sci. 2024 May 1;15:1361309. doi: 10.3389/fpls.2024.1361309. eCollection 2024.
4
An Effective Hybrid Deep Learning Model for Single-Channel EEG-Based Subject-Independent Drowsiness Recognition.
Brain Topogr. 2024 Jan;37(1):1-18. doi: 10.1007/s10548-023-01016-0. Epub 2023 Nov 23.
5
PHND: Pashtu Handwritten Numerals Database and deep learning benchmark.
PLoS One. 2020 Sep 2;15(9):e0238423. doi: 10.1371/journal.pone.0238423. eCollection 2020.
6
S-LSTM-ATT: a hybrid deep learning approach with optimized features for emotion recognition in electroencephalogram.
Health Inf Sci Syst. 2023 Aug 29;11(1):40. doi: 10.1007/s13755-023-00242-x. eCollection 2023 Dec.
7
Prevalence and risk factors analysis of postpartum depression at early stage using hybrid deep learning model.
Sci Rep. 2024 Feb 24;14(1):4533. doi: 10.1038/s41598-024-54927-8.
8
An Investigation of Deep Learning Models for EEG-Based Emotion Recognition.
Front Neurosci. 2020 Dec 23;14:622759. doi: 10.3389/fnins.2020.622759. eCollection 2020.
9
End-to-end multimodal clinical depression recognition using deep neural networks: A comparative analysis.
Comput Methods Programs Biomed. 2021 Nov;211:106433. doi: 10.1016/j.cmpb.2021.106433. Epub 2021 Sep 28.
10
A Novel Gait Phase Recognition Method Based on DPF-LSTM-CNN Using Wearable Inertial Sensors.
Sensors (Basel). 2023 Jun 26;23(13):5905. doi: 10.3390/s23135905.

Cited by

1
MediScan: A Framework of U-Health and Prognostic AI Assessment on Medical Imaging.
J Imaging. 2024 Dec 13;10(12):322. doi: 10.3390/jimaging10120322.

References

1
Pashtu Language Digits Dataset.
Data Brief. 2022 Oct 26;45:108701. doi: 10.1016/j.dib.2022.108701. eCollection 2022 Dec.
2
On evaluation metrics for medical applications of artificial intelligence.
Sci Rep. 2022 Apr 8;12(1):5979. doi: 10.1038/s41598-022-09954-8.
3
The efficacy of deep learning based LSTM model in forecasting the outbreak of contagious diseases.
Infect Dis Model. 2022 Mar;7(1):170-183. doi: 10.1016/j.idm.2021.12.005. Epub 2021 Dec 28.
4
Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit.
Nature. 2000 Jun 22;405(6789):947-51. doi: 10.1038/35016072.