IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):849-862. doi: 10.1109/TPAMI.2017.2695539. Epub 2017 Apr 18.
Recent deep-learning-based approaches have achieved great success in handwriting recognition. Chinese characters form one of the most widely adopted writing systems in the world, and previous research has mainly focused on recognizing handwritten Chinese characters. However, recognition is only one aspect of understanding a language. Another challenging and interesting task is to teach a machine to automatically write (pictographic) Chinese characters. In this paper, we propose a framework that uses the recurrent neural network (RNN) as both a discriminative model for recognizing Chinese characters and a generative model for drawing (generating) them. To recognize Chinese characters, previous methods usually adopt convolutional neural network (CNN) models, which require transforming the online handwriting trajectory into image-like representations. In contrast, our RNN-based approach is an end-to-end system that deals directly with the sequential structure of the trajectory and does not require any domain-specific knowledge. With the RNN system (combining an LSTM and a GRU), state-of-the-art performance is achieved on the ICDAR-2013 competition database. Furthermore, under the RNN framework, we propose a conditional generative model with character embedding for automatically drawing recognizable Chinese characters. The generated characters (in vector format) are human-readable and can also be recognized by the discriminative RNN model with high accuracy. Experimental results verify the effectiveness of using RNNs as both generative and discriminative models for the tasks of drawing and recognizing Chinese characters.
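The abstract states that the RNN works directly on the online handwriting trajectory rather than on an image-like rendering, but it does not spell out the input encoding. As a minimal sketch only, one plausible encoding flattens the pen strokes into a sequence of pen offsets with a pen-state flag. The function name `strokes_to_sequence` and the `(dx, dy, pen_down)` layout are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def strokes_to_sequence(strokes: List[Stroke]) -> List[Tuple[float, float, int]]:
    """Flatten pen strokes into (dx, dy, pen_down) steps.

    Hypothetical encoding, not the paper's: each step is the offset from
    the previous point, and pen_down is 1 when the move was drawn within
    a stroke and 0 when the pen was lifted to start a new stroke.
    """
    sequence = []
    prev = None
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            if prev is not None:
                # The first point of a stroke is reached with the pen up.
                pen_down = 1 if i > 0 else 0
                sequence.append((x - prev[0], y - prev[1], pen_down))
            prev = (x, y)
    return sequence


# Two strokes: an L-shaped stroke, then a horizontal bar.
example = [[(0, 0), (1, 0), (1, 1)], [(0, 1), (2, 1)]]
print(strokes_to_sequence(example))
```

A sequence of this kind can be fed step by step to a recurrent model, which is what lets the approach skip the conversion to an image that CNN-based recognizers require.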