A Pix2Pix Architecture for Complete Offline Handwritten Text Normalization.

Authors

Barreiro-Garrido Alvaro, Ruiz-Parrado Victoria, Moreno A Belen, Velez Jose F

Affiliation

Higher Technical School of Computer Engineering, Universidad Rey Juan Carlos, c/Tulipan sn, Mostoles, 28922 Madrid, Spain.

Publication information

Sensors (Basel). 2024 Jun 16;24(12):3892. doi: 10.3390/s24123892.

DOI:10.3390/s24123892
PMID:38931676
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11207351/
Abstract

In the realm of offline handwritten text recognition, numerous normalization algorithms have been developed over the years to serve as preprocessing steps prior to applying automatic recognition models to handwritten text scanned images. These algorithms have demonstrated effectiveness in enhancing the overall performance of recognition architectures. However, many of these methods rely heavily on heuristic strategies that are not seamlessly integrated with the recognition architecture itself. This paper introduces the use of a Pix2Pix trainable model, a specific type of conditional generative adversarial network, as the method to normalize handwritten text images. Also, this algorithm can be seamlessly integrated as the initial stage of any deep learning architecture designed for handwritten recognition tasks. All of this facilitates training the normalization and recognition components as a unified whole, while still maintaining some interpretability of each module. Our proposed normalization approach learns from a blend of heuristic transformations applied to text images, aiming to mitigate the impact of intra-personal handwriting variability among different writers. As a result, it achieves slope and slant normalizations, alongside other conventional preprocessing objectives, such as normalizing the size of text ascenders and descenders. We will demonstrate that the proposed architecture replicates, and in certain cases surpasses, the results of a widely used heuristic algorithm across two metrics and when integrated as the first step of a deep recognition architecture.
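
To make the idea of a heuristic slant normalization concrete, the following is a minimal sketch of one classical approach: shear the image by several candidate amounts and keep the shear whose vertical projection profile is most concentrated. The abstract does not specify which heuristic algorithm the authors use as a baseline or as a training target, so this is an illustrative assumption rather than their method. The function names `shear_rows` and `estimate_slant`, the candidate shear values, and the sum-of-squares score are all hypothetical choices made for this example.

```python
import numpy as np


def shear_rows(img: np.ndarray, shift_per_row: float) -> np.ndarray:
    """Shift each row horizontally in proportion to its height above the bottom row.

    Hypothetical helper: pixels shifted outside the image are dropped.
    """
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        dx = int(round(shift_per_row * (h - 1 - y)))
        if abs(dx) >= w:
            continue  # the whole row would leave the image
        if dx >= 0:
            out[y, dx:] = img[y, : w - dx]
        else:
            out[y, : w + dx] = img[y, -dx:]
    return out


def estimate_slant(img: np.ndarray, candidates) -> float:
    """Return the candidate shear that maximizes the sum of squared column sums.

    Assumption: upright strokes concentrate ink in few columns, which makes
    the squared vertical projection profile large.
    """
    scores = []
    for s in candidates:
        profile = shear_rows(img, s).sum(axis=0).astype(np.float64)
        scores.append(float((profile ** 2).sum()))
    return float(candidates[int(np.argmax(scores))])


if __name__ == "__main__":
    # Synthetic 8x20 binary image with one stroke leaning to the right.
    h, w = 8, 20
    img = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        img[y, 5 + (h - 1 - y)] = 1

    shear = estimate_slant(img, [-2.0, -1.0, 0.0, 1.0, 2.0])
    upright = shear_rows(img, shear)
    print("estimated shear per row:", shear)
    print("ink per column after correction:", upright.sum(axis=0))
```

A trainable model such as the one described in the abstract would learn a mapping toward images normalized in this spirit, instead of running a search like this on every input.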

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dd3/11207351/b6c6155af480/sensors-24-03892-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dd3/11207351/f0c28f2c2238/sensors-24-03892-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dd3/11207351/3cc1ff9ac5b5/sensors-24-03892-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dd3/11207351/f3fe796a4dbe/sensors-24-03892-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dd3/11207351/6804f252c816/sensors-24-03892-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dd3/11207351/3a4f1164dd70/sensors-24-03892-g006.jpg

Similar articles

1
Improving offline handwritten text recognition with hybrid HMM/ANN models.
IEEE Trans Pattern Anal Mach Intell. 2011 Apr;33(4):767-79. doi: 10.1109/TPAMI.2010.141.
2
Generative adversarial network based adaptive data augmentation for handwritten Arabic text recognition.
PeerJ Comput Sci. 2022 Jan 25;8:e861. doi: 10.7717/peerj-cs.861. eCollection 2022.
3
Enhancement of handwritten text recognition using AI-based hybrid approach.
MethodsX. 2024 Mar 10;12:102654. doi: 10.1016/j.mex.2024.102654. eCollection 2024 Jun.
4
Kurdish Handwritten character recognition using deep learning techniques.
Gene Expr Patterns. 2022 Dec;46:119278. doi: 10.1016/j.gep.2022.119278. Epub 2022 Oct 3.
5
Automatic normalized digital color staining in the recognition of abnormal blood cells using generative adversarial networks.
Comput Methods Programs Biomed. 2023 Oct;240:107629. doi: 10.1016/j.cmpb.2023.107629. Epub 2023 May 30.
6
Fast writer adaptation with style extractor network for handwritten text recognition.
Neural Netw. 2022 Mar;147:42-52. doi: 10.1016/j.neunet.2021.12.002. Epub 2021 Dec 9.
7
Content and Style Aware Generation of Text-Line Images for Handwriting Recognition.
IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):8846-8860. doi: 10.1109/TPAMI.2021.3122572. Epub 2022 Nov 7.
8
Leveraging ShuffleNet transfer learning to enhance handwritten character recognition.
Gene Expr Patterns. 2022 Sep;45:119263. doi: 10.1016/j.gep.2022.119263. Epub 2022 Jul 16.
9
A Novel GAN-Based Synthesis Method for In-Air Handwritten Words.
Sensors (Basel). 2020 Nov 16;20(22):6548. doi: 10.3390/s20226548.

Cited by

1
Text Font Correction and Alignment Method for Scene Text Recognition.
Sensors (Basel). 2024 Dec 11;24(24):7917. doi: 10.3390/s24247917.
