Efficient Learning-Free Keyword Spotting

Author information

Retsinas George, Louloudis Georgios, Stamatopoulos Nikolaos, Gatos Basilis

Publication information

IEEE Trans Pattern Anal Mach Intell. 2019 Jul;41(7):1587-1600. doi: 10.1109/TPAMI.2018.2845880. Epub 2018 Jun 11.

DOI:10.1109/TPAMI.2018.2845880

Abstract

In this article, a method for segmentation-based learning-free Query by Example (QbE) keyword spotting on handwritten documents is proposed. The method consists of three steps, namely preprocessing, feature extraction and matching, which address critical variations of text images (e.g., skew, translation, different writing styles). During the feature extraction step, a sequence of descriptors is generated using a combination of a zoning scheme and a novel appearance descriptor, referred to as modified Projections of Oriented Gradients. The preprocessing step, which includes contrast normalization and main-zone detection, aims to overcome the shortcomings of the appearance descriptor. Moreover, an uneven zoning scheme is introduced by applying a denser zoning only on query images for a more detailed representation. This leads to a significant reduction in the storage requirements of a document collection. The distance between the query and word sequences is efficiently computed by the proposed Selective Matching algorithm. This algorithm is further extended to handle an augmented set of images originating from a single query image. The efficiency of the proposed method is demonstrated by experiments conducted on seven publicly available datasets. In these experiments, the proposed method significantly outperforms all state-of-the-art learning-free techniques.
