Pashto script and graphics detection in camera captured Pashto document images using deep learning model.

Authors

Bahadar Khan, Ahmad Riaz, Aurangzeb Khursheed, Muhammad Siraj, Ullah Khalil, Hussain Ibrar, Syed Ikram, Shahid Anwar Muhammad

Affiliations

Department of Computer Science, Shaheed Benazir Bhutto University, Sheringal, Pakistan.

Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia.

Publication information

PeerJ Comput Sci. 2024 Jul 26;10:e2089. doi: 10.7717/peerj-cs.2089. eCollection 2024.

DOI:10.7717/peerj-cs.2089
PMID:39145223
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11323099/
Abstract

Layout analysis is a main component of a typical Document Image Analysis (DIA) system and plays an important role in pre-processing. However, document images in the Pashto language have not been explored so far. This research examines, for the first time, Pashto text together with graphics and proposes a deep learning-based classifier that detects Pashto text and graphics in each document. Another notable contribution is the creation of a real dataset containing more than 1,000 camera-captured images of Pashto documents. On this dataset, we applied a convolutional neural network (CNN), a deep learning technique. Our proposed method is based on the Single-Shot Detector (SSD), described here as an advanced and classical variant of Faster R-CNN. The evaluation was performed on 300 images from the test set, on which we achieved a mean average precision (mAP) of 84.90%.
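The abstract reports a mean average precision (mAP) but does not state the evaluation protocol. As a rough illustration of how such a figure is typically computed for two classes (text and graphics), the sketch below matches detections to ground-truth boxes by intersection over union (IoU) and averages the per-class average precision. The IoU threshold of 0.5, the all-point interpolation, the class names, and every function name here are assumptions made for illustration; none of them is taken from the paper.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, float, Box]       # (image_id, confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections: List[Detection],
                      ground_truth: Dict[str, List[Box]],
                      iou_threshold: float = 0.5) -> float:
    """Average precision for ONE class (assumed protocol, not the paper's)."""
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    tp_flags: List[bool] = []
    # Visit detections from most to least confident.
    for image_id, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(image_id, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # A detection is a true positive only if its best-overlapping
        # ground-truth box clears the threshold and is still unmatched.
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[image_id][best_idx]:
            matched[image_id][best_idx] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / total_gt)
    # Make precision non-increasing from right to left (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(
        per_class: Dict[str, Tuple[List[Detection], Dict[str, List[Box]]]]) -> float:
    """Unweighted mean of the per-class average precision values."""
    aps = [average_precision(dets, gts) for dets, gts in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

Calling `mean_average_precision` with one entry per class, for example `{"text": (text_detections, text_ground_truth), "graphics": (graphics_detections, graphics_ground_truth)}`, returns a value between 0 and 1. A reported score such as 84.90% corresponds to 0.849 on that scale, although the paper's exact threshold and interpolation scheme may differ from the ones assumed here.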

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/33ac7845f293/peerj-cs-10-2089-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/91b4d3144077/peerj-cs-10-2089-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/9c73acf63e81/peerj-cs-10-2089-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/c98689bd6b72/peerj-cs-10-2089-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/a3b5f37c64f3/peerj-cs-10-2089-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/8fc21bb2f641/peerj-cs-10-2089-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/d6750d066aed/peerj-cs-10-2089-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/3be54c58ad07/peerj-cs-10-2089-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/8655db36a9c9/peerj-cs-10-2089-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/471ec1611604/peerj-cs-10-2089-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/13ba5367b121/peerj-cs-10-2089-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43d7/11323099/7c83efcb700f/peerj-cs-10-2089-g012.jpg

Similar articles

1. Pashto script and graphics detection in camera captured Pashto document images using deep learning model.
   PeerJ Comput Sci. 2024 Jul 26;10:e2089. doi: 10.7717/peerj-cs.2089. eCollection 2024.
2. Pashto offensive language detection: a benchmark dataset and monolingual Pashto BERT.
   PeerJ Comput Sci. 2023 Oct 18;9:e1617. doi: 10.7717/peerj-cs.1617. eCollection 2023.
3. Pashto Handwritten Invariant Character Trajectory Prediction Using a Customized Deep Learning Technique.
   Sensors (Basel). 2023 Jun 30;23(13):6060. doi: 10.3390/s23136060.
4. Recognition of Pashto Handwritten Characters Based on Deep Learning.
   Sensors (Basel). 2020 Oct 17;20(20):5884. doi: 10.3390/s20205884.
5. Pashto poetry generation: deep learning with pre-trained transformers for low-resource languages.
   PeerJ Comput Sci. 2024 Aug 30;10:e2163. doi: 10.7717/peerj-cs.2163. eCollection 2024.
6. Deep learning-based recognition system for pashto handwritten text: benchmark on PHTI.
   PeerJ Comput Sci. 2024 Mar 27;10:e1925. doi: 10.7717/peerj-cs.1925. eCollection 2024.
7. A Robust Deep-Learning-Based Detector for Real-Time Tomato Plant Diseases and Pests Recognition.
   Sensors (Basel). 2017 Sep 4;17(9):2022. doi: 10.3390/s17092022.
8. Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach.
   PLoS One. 2015 Sep 14;10(9):e0133648. doi: 10.1371/journal.pone.0133648. eCollection 2015.
9. Agricultural Greenhouses Detection in High-Resolution Satellite Images Based on Convolutional Neural Networks: Comparison of Faster R-CNN, YOLO v3 and SSD.
   Sensors (Basel). 2020 Aug 31;20(17):4938. doi: 10.3390/s20174938.
10. Cursive-Text: A Comprehensive Dataset for End-to-End Urdu Text Recognition in Natural Scene Images.
    Data Brief. 2020 May 21;31:105749. doi: 10.1016/j.dib.2020.105749. eCollection 2020 Aug.
