Offline recognition of unconstrained handwritten texts using HMMs and statistical language models.

Authors

Vinciarelli Alessandro, Bengio Samy, Bunke Horst

Affiliation

Dalle Molle Institute for Perceptual Artificial Intelligence, Rue du Simplon 4, 1920 Martigny, Switzerland.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2004 Jun;26(6):709-20. doi: 10.1109/TPAMI.2004.14.

Abstract

This paper presents a system for the offline recognition of large vocabulary unconstrained handwritten texts. The only assumption made about the data is that it is written in English. This allows the application of Statistical Language Models in order to improve the performance of our system. Several experiments have been performed using both single and multiple writer data. Lexica of variable size (from 10,000 to 50,000 words) have been used. The use of language models is shown to improve the accuracy of the system (when the lexicon contains 50,000 words, the error rate is reduced by approximately 50 percent for single writer data and by approximately 25 percent for multiple writer data). Our approach is described in detail and compared with other methods presented in the literature to deal with the same problem. An experimental setup to correctly deal with unconstrained text recognition is proposed.
