Offline recognition of unconstrained handwritten texts using HMMs and statistical language models.

Authors

Vinciarelli Alessandro, Bengio Samy, Bunke Horst

Affiliation

Dalle Molle Institute for Perceptual Artificial Intelligence, Rue du Simplon 4, 1920 Martigny, Switzerland.

Publication

IEEE Trans Pattern Anal Mach Intell. 2004 Jun;26(6):709-20. doi: 10.1109/TPAMI.2004.14.

DOI:10.1109/TPAMI.2004.14

PMID:18579932

Abstract

This paper presents a system for the offline recognition of large vocabulary unconstrained handwritten texts. The only assumption made about the data is that it is written in English. This allows the application of Statistical Language Models in order to improve the performance of our system. Several experiments have been performed using both single and multiple writer data. Lexica of variable size (from 10,000 to 50,000 words) have been used. The use of language models is shown to improve the accuracy of the system (when the lexicon contains 50,000 words, the error rate is reduced by approximately 50 percent for single writer data and by approximately 25 percent for multiple writer data). Our approach is described in detail and compared with other methods presented in the literature to deal with the same problem. An experimental setup to correctly deal with unconstrained text recognition is proposed.
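The abstract does not say how the language model is combined with the HMM scores, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes a bigram model with add-one smoothing and a Viterbi search over per-position word candidates; the names `train_bigram`, `decode`, and `lm_weight`, and the toy log-likelihoods, are all hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences, vocab):
    """Return a function giving add-one smoothed bigram log-probabilities."""
    context_counts, bigram_counts = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent  # "<s>" marks the sentence start
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    size = len(vocab)

    def logp(prev, cur):
        return math.log((bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + size))

    return logp


def decode(candidates, logp, lm_weight=1.0):
    """Viterbi search over candidate words at each position.

    candidates: list of dicts mapping word -> recognizer log-likelihood
                (stand-ins for HMM scores; assumed, not taken from the paper).
    lm_weight:  scale applied to the language model term (0 disables it).
    """
    best = {"<s>": (0.0, [])}  # word -> (score, path) of best path ending there
    for position in candidates:
        new_best = {}
        for word, obs in position.items():
            score, path = max(
                ((s + lm_weight * logp(prev, word) + obs, p)
                 for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            new_best[word] = (score, path + [word])
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]


corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog", "sat"]]
vocab = {"the", "cat", "sat", "ran", "dog", "car"}
logp = train_bigram(corpus, vocab)

# Toy recognizer output: "car" looks slightly more likely than "cat" visually.
candidates = [{"the": -1.0}, {"car": -1.0, "cat": -1.2}, {"sat": -1.0}]
print(decode(candidates, logp, lm_weight=0.0))
print(decode(candidates, logp, lm_weight=1.0))
```

With the language model switched off, the search follows the recognizer scores alone; with it switched on, word sequences seen in the training text are favored, which is the kind of effect the reported error-rate reductions rely on.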

Similar articles

Offline recognition of unconstrained handwritten texts using HMMs and statistical language models.

IEEE Trans Pattern Anal Mach Intell. 2004 Jun;26(6):709-20. doi: 10.1109/TPAMI.2004.14.

Recognition and verification of unconstrained handwritten words.

IEEE Trans Pattern Anal Mach Intell. 2005 Oct;27(10):1509-22. doi: 10.1109/TPAMI.2005.207.

A practical approach for writer-dependent symbol recognition using a writer-independent symbol recognizer.

IEEE Trans Pattern Anal Mach Intell. 2007 Nov;29(11):1917-26. doi: 10.1109/TPAMI.2007.1109.

Offline loop investigation for handwriting analysis.

IEEE Trans Pattern Anal Mach Intell. 2009 Feb;31(2):193-209. doi: 10.1109/TPAMI.2008.68.

A scale space approach for automatically segmenting words from historical handwritten documents.

IEEE Trans Pattern Anal Mach Intell. 2005 Aug;27(8):1212-25. doi: 10.1109/TPAMI.2005.150.

An approach to offline handwritten Chinese character recognition based on segment evaluation of adaptive duration.

J Zhejiang Univ Sci. 2004 Nov;5(11):1392-7. doi: 10.1631/jzus.2004.1392.

A novel document ranking method using the discrete cosine transform.

IEEE Trans Pattern Anal Mach Intell. 2005 Jan;27(1):130-5. doi: 10.1109/TPAMI.2005.2.

Offline geometric parameters for automatic signature verification using fixed-point arithmetic.

IEEE Trans Pattern Anal Mach Intell. 2005 Jun;27(6):993-7. doi: 10.1109/TPAMI.2005.125.

Style context with second-order statistics.

IEEE Trans Pattern Anal Mach Intell. 2005 Jan;27(1):14-22. doi: 10.1109/TPAMI.2005.19.

Machine printed text and handwriting identification in noisy document images.

IEEE Trans Pattern Anal Mach Intell. 2004 Mar;26(3):337-53. doi: 10.1109/TPAMI.2004.1262324.

Cited by

A Pix2Pix Architecture for Complete Offline Handwritten Text Normalization.

Sensors (Basel). 2024 Jun 16;24(12):3892. doi: 10.3390/s24123892.

Attention-Based Fully Gated CNN-BGRU for Russian Handwritten Text.

J Imaging. 2020 Dec 18;6(12):141. doi: 10.3390/jimaging6120141.
