Ding Xiaoqing, Chen Li, Wu Tao
Electronics Engineering Department, Tsinghua University, Beijing, P.R. China.
IEEE Trans Pattern Anal Mach Intell. 2007 Feb;29(2):195-204. doi: 10.1109/TPAMI.2007.26.
A novel algorithm for font recognition on a single unknown Chinese character, independent of the identity of the character, is proposed in this paper. We apply a wavelet transform to the character image and extract wavelet features from the transformed image. After a Box-Cox transformation and LDA (Linear Discriminant Analysis), the discriminating features for font recognition are classified by an MQDF (Modified Quadric Distance Function) classifier with only one prototype for each font class. Our experiments show that the algorithm achieves a recognition rate of 90.28 percent on a single unknown character and 99.01 percent when five characters are used for font recognition. Compared with existing methods, all of which are based on a text block, our method provides a higher recognition rate and is more flexible and robust, since it is based on a single unknown character. Additionally, our method demonstrates that it is possible to extract subtle yet discriminative signals embedded in a much larger noisy background.
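To make the front end of this pipeline concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not name the wavelet, the exact features, the Box-Cox parameter, or the rule for combining five characters, so everything specific here is an assumption for illustration: a one-level orthonormal Haar decomposition, the mean absolute coefficient of each detail sub-band as a feature, the hypothetical parameters `lam` and `eps`, and a sum-of-distances rule in `combine_characters`. The LDA and MQDF stages are omitted; `combine_characters` simply takes per-character class distances as input, however they were produced.

```python
import math


def haar2d(img):
    """One-level orthonormal 2D Haar transform (assumed wavelet).

    img is a list of rows with even height and width. Returns four
    half-size sub-bands: average, column difference, row difference,
    and diagonal difference.
    """
    avg, dcol, drow, ddiag = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2.0)  # local average
            rows[1].append((a - b + d - e) / 2.0)  # change across columns
            rows[2].append((a + b - d - e) / 2.0)  # change across rows
            rows[3].append((a - b - d + e) / 2.0)  # diagonal change
        avg.append(rows[0])
        dcol.append(rows[1])
        drow.append(rows[2])
        ddiag.append(rows[3])
    return avg, dcol, drow, ddiag


def box_cox(x, lam):
    """Box-Cox power transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def font_features(img, lam=0.5, eps=1e-6):
    """Hypothetical feature vector: Box-Cox of the mean absolute
    coefficient in each of the three detail sub-bands."""
    feats = []
    for band in haar2d(img)[1:]:
        vals = [abs(v) for row in band for v in row]
        # eps keeps the value positive when a sub-band is all zeros
        feats.append(box_cox(sum(vals) / len(vals) + eps, lam))
    return feats


def combine_characters(distances):
    """Assumed multi-character rule: sum each font class's distance
    over all characters and return the index of the smallest total.

    distances[i][k] is the distance of character i to font class k.
    """
    totals = [sum(col) for col in zip(*distances)]
    return totals.index(min(totals))
```

For example, `combine_characters([[1.0, 2.0], [3.0, 0.5], [1.0, 1.2]])` selects class 1, because its summed distance (3.7) is smaller than that of class 0 (5.0), even though the first and third characters on their own favor class 0. This is one plausible way in which pooling several characters could raise accuracy over a single-character decision.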