Li Lei, Zou Changqing, Zheng Youyi, Su Qingkun, Fu Hongbo, Tai Chiew-Lan
IEEE Trans Vis Comput Graph. 2021 Sep;27(9):3745-3754. doi: 10.1109/TVCG.2020.2987626. Epub 2021 Jul 29.
Sketches in existing large-scale datasets, such as the recent QuickDraw collection, are often stored in a vector format, with strokes consisting of sequentially sampled points. However, most existing sketch recognition methods rasterize vector sketches as binary images and then adopt image classification techniques. In this article, we propose a novel end-to-end single-branch network architecture, RNN-Rasterization-CNN (Sketch-R2CNN for short), to fully leverage the vector format of sketches for recognition. Sketch-R2CNN takes a vector sketch as input and uses an RNN to extract per-point features in the vector space. We then develop a neural line rasterization module to convert the vector sketch and the per-point features into multi-channel point feature maps, which are subsequently fed to a CNN to extract convolutional features in the pixel space. Our neural line rasterization module is designed to be differentiable for end-to-end learning. We perform experiments on existing large-scale sketch recognition datasets and show that the RNN-Rasterization design brings consistent improvement over CNN baselines and that Sketch-R2CNN substantially outperforms the state-of-the-art methods.
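The rasterization step described in the abstract (turning a vector sketch plus per-point features into a multi-channel feature map) can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the function name `rasterize_point_features`, the `stroke_ends` pen-lift flags, the linear blending of endpoint features along each segment, and the last-write-wins handling of overlapping pixels are all assumptions made for illustration. The version below is plain Python and does not track gradients; an end-to-end trainable version would have to express the same blending inside an autodiff framework.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def rasterize_point_features(
    points: Sequence[Point],
    features: Sequence[Sequence[float]],
    stroke_ends: Sequence[bool],
    size: int,
) -> List[List[List[float]]]:
    """Paint per-point features onto a [channels][size][size] grid.

    points      -- (x, y) pixel coordinates, in drawing order
    features    -- one feature vector per point (e.g. RNN outputs)
    stroke_ends -- True where the pen lifts after that point (assumed encoding)
    """
    channels = len(features[0])
    grid = [[[0.0] * size for _ in range(size)] for _ in range(channels)]

    for i in range(len(points) - 1):
        if stroke_ends[i]:
            continue  # no line is drawn between two separate strokes
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        # Take enough samples to touch every pixel along the segment.
        steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0))))
        for s in range(steps + 1):
            t = s / steps
            col = int(round(x0 + t * (x1 - x0)))
            row = int(round(y0 + t * (y1 - y0)))
            if 0 <= row < size and 0 <= col < size:
                for c in range(channels):
                    # Assumed rule: blend the two endpoint features linearly.
                    # Pixels hit more than once keep the most recent value.
                    grid[c][row][col] = (
                        (1 - t) * features[i][c] + t * features[i + 1][c]
                    )
    return grid
```

Pixels not covered by any segment stay at zero, so the resulting grid has the same sparsity pattern as a binary rasterization of the sketch while carrying one channel per feature dimension, which is the form a CNN expects as input.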