Department of Computer Science, University of Washington, Seattle, WA 98195; AT&T Bell Laboratories, Murray Hill, NJ 07974.
IEEE Trans Pattern Anal Mach Intell. 1987 Feb;9(2):274-88. doi: 10.1109/tpami.1987.4767901.
We describe the current state of a system that recognizes printed text of various fonts and sizes for the Roman alphabet. The system combines several techniques in order to improve the overall recognition rate. Thinning and shape extraction are performed directly on a graph of the run-length encoding of a binary image. The resulting strokes and other shapes are mapped, using a shape-clustering approach, into binary features which are then fed into a statistical Bayesian classifier. Large-scale trials have shown better than 97 percent top choice correct performance on mixtures of six dissimilar fonts, and over 99 percent on most single fonts, over a range of point sizes. Certain remaining confusion classes are disambiguated through contour analysis, and characters suspected of being merged are broken and reclassified. Finally, layout and linguistic context are applied. The results are illustrated by sample pages.
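As a rough illustration of what "a graph of the run-length encoding of a binary image" could look like, the sketch below encodes each row of a small binary image as horizontal runs of foreground pixels and links runs on adjacent rows whose column spans overlap. The function names (`run_length_encode`, `build_run_graph`), the run representation, and the overlap rule are assumptions made for this example; the abstract does not specify the system's actual data structures.

```python
from typing import Dict, List, Sequence, Tuple

# A run is (row, start_col, end_col_exclusive). This layout is an
# illustrative assumption, not the representation used in the paper.
Run = Tuple[int, int, int]


def run_length_encode(image: Sequence[Sequence[int]]) -> List[Run]:
    """Return the horizontal runs of nonzero (foreground) pixels, row by row."""
    runs: List[Run] = []
    for r, row in enumerate(image):
        c, n = 0, len(row)
        while c < n:
            if row[c]:
                start = c
                while c < n and row[c]:
                    c += 1
                runs.append((r, start, c))
            else:
                c += 1
    return runs


def build_run_graph(runs: List[Run]) -> List[Tuple[int, int]]:
    """Link runs on consecutive rows whose column spans share a column.

    Edges are index pairs (upper_run, lower_run) into `runs`.
    """
    by_row: Dict[int, List[int]] = {}
    for i, (r, _, _) in enumerate(runs):
        by_row.setdefault(r, []).append(i)

    edges: List[Tuple[int, int]] = []
    for i, (r, s, e) in enumerate(runs):
        for j in by_row.get(r + 1, []):
            _, s2, e2 = runs[j]
            if s < e2 and s2 < e:  # half-open intervals overlap
                edges.append((i, j))
    return edges


image = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 1, 1],
]
runs = run_length_encode(image)
edges = build_run_graph(runs)
print(runs)
print(edges)
```

Operations such as thinning or stroke extraction could then walk this graph instead of touching individual pixels, which is presumably the appeal of working on the run-length representation directly.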
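The step in which binary features are fed into a statistical Bayesian classifier can be sketched with a Bernoulli naive Bayes model. This is only one possible reading: the abstract does not say that features are treated as conditionally independent, so the independence assumption, the Laplace smoothing, the three toy features, and the names `train_bernoulli_nb` and `rank_classes` are all assumptions introduced for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Model = Dict[str, Tuple[float, List[float]]]


def train_bernoulli_nb(samples: Sequence[Sequence[int]],
                       labels: Sequence[str],
                       alpha: float = 1.0) -> Model:
    """Estimate log P(class) and P(feature = 1 | class) with Laplace smoothing."""
    n_features = len(samples[0])
    class_counts: Dict[str, int] = defaultdict(int)
    feature_counts: Dict[str, List[int]] = defaultdict(lambda: [0] * n_features)
    for x, y in zip(samples, labels):
        class_counts[y] += 1
        for k, bit in enumerate(x):
            feature_counts[y][k] += bit

    total = len(samples)
    model: Model = {}
    for y, n in class_counts.items():
        probs = [(feature_counts[y][k] + alpha) / (n + 2 * alpha)
                 for k in range(n_features)]
        model[y] = (math.log(n / total), probs)
    return model


def rank_classes(model: Model, x: Sequence[int]) -> List[str]:
    """Return class labels ordered from most to least probable for x."""
    scores: Dict[str, float] = {}
    for y, (log_prior, probs) in model.items():
        score = log_prior
        for bit, p in zip(x, probs):
            score += math.log(p if bit else 1.0 - p)
        scores[y] = score
    return sorted(scores, key=lambda y: scores[y], reverse=True)


# Hypothetical binary features: [vertical stroke, closed loop, dot above].
samples = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [1, 0, 1], [1, 0, 1]]
labels = ["l", "l", "o", "o", "i", "i"]

model = train_bernoulli_nb(samples, labels)
ranking = rank_classes(model, [1, 0, 1])
print(ranking)
```

Returning a ranked list rather than a single label mirrors the "top choice" wording of the abstract: later stages such as contour analysis or linguistic context could reorder close candidates.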