ICVSLab., Department of Electronic Engineering, Yeungnam University, 280 Daehak-ro, Gyeongsan 38541, Gyeongbuk, Korea.
Department of Electrical Engineering, Pohang University of Science and Technology, Pohang 37673, Gyeongbuk, Korea.
Sensors (Basel). 2022 Mar 19;22(6):2374. doi: 10.3390/s22062374.
Making a new font requires graphic designs for all base characters, and this design process consumes considerable time and human resources. For languages with a large number of consonant-vowel combinations in particular, designing all such combinations independently is a heavy burden. Automatic font generation methods have been proposed to reduce this labor-intensive design work. Most of these methods are GAN-based and are limited to generating the fonts they were trained on. Some previous methods use two encoders, one for content and one for style, but their disentanglement of content and style is not effective enough to generate arbitrary fonts. Arbitrary font generation is challenging because each font image carries both text content and font style, and learning the two separately from such images is very difficult. In this paper, we propose a new automatic font generation method to solve this disentanglement problem. First, we use two stacked inputs: images with the same text but different font styles as the content input, and images with the same font style but different text as the style input. Second, we propose new consistency losses that force any combination of the encoded features of the stacked inputs to have the same values. Our experiments show that, by separating the content and style encoders, our method extracts consistent features for text content and font style, and that this works well for generating unseen font designs from a small number of human-designed reference font images. Compared with previous methods, the font designs generated by our method show better quality, both qualitatively and quantitatively, for Korean, Chinese, and English characters, e.g., a 17.84 lower FID on unseen fonts.
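The consistency loss described above can be sketched in a minimal, framework-free form. This is an illustrative assumption, not the authors' implementation: feature vectors are plain Python lists, and the loss penalizes any deviation among encodings that should agree (content features of same-text images, or style features of same-style images) by measuring their spread around the mean encoding.

```python
# Illustrative sketch (NOT the paper's code): a consistency loss over the
# encoded features of a set of stacked inputs. The encoder and feature
# shapes are hypothetical; features are plain lists of floats here.

def consistency_loss(features):
    """Mean squared deviation of each feature vector from the mean vector.

    `features` holds one encoding per stacked input image. All of them are
    expected to encode the same factor (same text for the content encoder,
    same style for the style encoder), so the loss is zero only when every
    encoding is identical.
    """
    n = len(features)
    dim = len(features[0])
    # Element-wise mean across the stacked inputs.
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    # Average squared distance of each encoding from the mean.
    return sum(
        (f[d] - mean[d]) ** 2 for f in features for d in range(dim)
    ) / (n * dim)

# Identical encodings give zero loss; disagreeing encodings are penalized.
same = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
diff = [[1.0, 2.0], [3.0, 2.0], [1.0, 4.0]]
print(consistency_loss(same))  # 0.0
print(consistency_loss(diff))  # positive
```

Minimizing this term pushes the content encoder to produce the same representation regardless of font style (and vice versa for the style encoder), which is the disentanglement the abstract targets.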