Adapting multilingual vision language transformers for low-resource Urdu optical character recognition (OCR).

Authors

Cheema Musa Dildar Ahmed, Shaiq Mohammad Daniyal, Mirza Farhaan, Kamal Ali, Naeem M Asif

Affiliations

Department of Artificial Intelligence and Data Science, National University of Computer and Emerging Sciences, Islamabad, Pakistan.

School of Computer, Engineering and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand.

Publication information

PeerJ Comput Sci. 2024 Apr 29;10:e1964. doi: 10.7717/peerj-cs.1964. eCollection 2024.

DOI:10.7717/peerj-cs.1964
PMID:38699211
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11065407/
Abstract

In the realm of digitizing written content, the challenges posed by low-resource languages are noteworthy. These languages, often lacking in comprehensive linguistic resources, require specialized attention to develop robust systems for accurate optical character recognition (OCR). This article addresses the significance of focusing on such languages and introduces ViLanOCR, an innovative bilingual OCR system tailored for Urdu and English. Unlike existing systems, which struggle with the intricacies of low-resource languages, ViLanOCR leverages advanced multilingual transformer-based language models to achieve superior performance. The proposed approach is evaluated using the character error rate (CER) metric and achieves state-of-the-art results on the Urdu UHWR dataset, with a CER of 1.1%. The experimental results demonstrate the effectiveness of the proposed approach, surpassing state-of-the-art baselines in Urdu handwriting digitization.
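
The abstract reports results as a character error rate (CER). As a rough illustration only, CER is conventionally defined as the character-level edit distance between the recognized text and the ground truth, divided by the length of the ground truth. The sketch below assumes that standard definition; the function names `levenshtein` and `cer` are hypothetical, and the abstract does not say how the authors normalize or tokenize text before scoring.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over reference length.

    Assumption: characters are Python code points. Urdu text with
    combining marks may need Unicode normalization first; whether the
    paper does this is not stated in the abstract.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of four gives a CER of 25%.
    print(f"{cer('abcd', 'abxd'):.2%}")
```

Under this definition, a CER of 1.1% corresponds to roughly one character-level edit per 90 ground-truth characters. Note that CER can exceed 100% when the hypothesis contains many spurious insertions.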

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f3a/11065407/69d226e3909e/peerj-cs-10-1964-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f3a/11065407/49b0790d4514/peerj-cs-10-1964-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f3a/11065407/614db14d0cca/peerj-cs-10-1964-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f3a/11065407/b3fad26172b1/peerj-cs-10-1964-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f3a/11065407/ce75e6099866/peerj-cs-10-1964-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f3a/11065407/86bbd10b1afa/peerj-cs-10-1964-g006.jpg

Similar articles

1
Adapting multilingual vision language transformers for low-resource Urdu optical character recognition (OCR).
PeerJ Comput Sci. 2024 Apr 29;10:e1964. doi: 10.7717/peerj-cs.1964. eCollection 2024.
2
ET-Network: A novel efficient transformer deep learning model for automated Urdu handwritten text recognition.
PLoS One. 2024 May 17;19(5):e0302590. doi: 10.1371/journal.pone.0302590. eCollection 2024.
3
Cursive-Text: A Comprehensive Dataset for End-to-End Urdu Text Recognition in Natural Scene Images.
Data Brief. 2020 May 21;31:105749. doi: 10.1016/j.dib.2020.105749. eCollection 2020 Aug.
4
An online multilingual numeral dataset on Devnagari and English languages for pattern recognition research.
Data Brief. 2023 Oct 31;51:109743. doi: 10.1016/j.dib.2023.109743. eCollection 2023 Dec.
5
Multilingual character recognition dataset for Moroccan official documents.
Data Brief. 2023 Dec 13;52:109953. doi: 10.1016/j.dib.2023.109953. eCollection 2024 Feb.
6
Multilingual event extraction for epidemic detection.
Artif Intell Med. 2015 Oct;65(2):131-43. doi: 10.1016/j.artmed.2015.06.005. Epub 2015 Jul 17.
7
Investigating Children's Narrative Abilities in a Chinese and Multilingual Context: Cantonese, Mandarin, Kam and Urdu Adaptations of the Multilingual Assessment Instrument for Narratives (MAIN).
Front Psychol. 2020 Nov 20;11:573780. doi: 10.3389/fpsyg.2020.573780. eCollection 2020.
8
The ambivalent role of Urdu and English in multilingual Pakistan: a Bourdieusian study.
Lang Policy. 2023;22(1):25-48. doi: 10.1007/s10993-022-09623-6. Epub 2022 Mar 22.
9
A versatile dataset for intrinsic plagiarism detection, text reuse analysis, and author clustering in Urdu.
Data Brief. 2023 Nov 26;52:109857. doi: 10.1016/j.dib.2023.109857. eCollection 2024 Feb.
10
Multilingual end-to-end ASR for low-resource Turkic languages with common alphabets.
Sci Rep. 2024 Jun 15;14(1):13835. doi: 10.1038/s41598-024-64848-1.

Cited by

1
A scarce dataset for ancient Arabic handwritten text recognition.
Data Brief. 2024 Aug 8;56:110813. doi: 10.1016/j.dib.2024.110813. eCollection 2024 Oct.
