Cheema Musa Dildar Ahmed, Shaiq Mohammad Daniyal, Mirza Farhaan, Kamal Ali, Naeem M Asif
Department of Artificial Intelligence and Data Science, National University of Computer and Emerging Sciences, Islamabad, Pakistan.
School of Computer, Engineering and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand.
PeerJ Comput Sci. 2024 Apr 29;10:e1964. doi: 10.7717/peerj-cs.1964. eCollection 2024.
Low-resource languages pose notable challenges for the digitization of written content. These languages often lack comprehensive linguistic resources and require specialized attention to develop robust systems for accurate optical character recognition (OCR). This article addresses the significance of focusing on such languages and introduces ViLanOCR, a bilingual OCR system tailored for Urdu and English. Unlike existing systems, which struggle with the intricacies of low-resource languages, ViLanOCR leverages multilingual transformer-based language models to achieve superior performance. The proposed approach is evaluated using the character error rate (CER) metric and achieves state-of-the-art results on the Urdu UHWR dataset, with a CER of 1.1%. The experimental results demonstrate the effectiveness of the proposed approach, which surpasses state-of-the-art baselines in Urdu handwriting digitization.
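For readers unfamiliar with the metric, CER is commonly defined as the character-level edit distance between the recognized text and the reference transcription, divided by the reference length. The sketch below illustrates that common definition only; the abstract does not state the exact normalization or text preprocessing used for ViLanOCR, and the function names `levenshtein` and `cer` are illustrative, not taken from the article. It counts Unicode code points, so Urdu diacritics encoded as combining marks are treated as separate characters.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the first i-1 characters of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # skip a character of ref
                            curr[j - 1] + 1,      # skip a character of hyp
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length
    (assumed definition; the article may normalize differently)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One dropped character out of a four-character Urdu reference.
    print(cer("سلام", "سلم"))
```

Under this definition, a CER of 1.1% corresponds to roughly one character edit per 90 reference characters.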