Zheng Yixiao, Xie Jiyang, Sain Aneeshan, Song Yi-Zhe, Ma Zhanyu
IEEE Trans Image Process. 2023;32:4595-4609. doi: 10.1109/TIP.2023.3302521. Epub 2023 Aug 16.
Sketch is by now a well-researched topic in the vision community. Sketch semantic segmentation in particular serves as a fundamental step towards finer-level sketch interpretation. Recent works extract discriminative features from sketches in various ways and have achieved considerable improvements in segmentation accuracy. Common approaches attend to the sketch image as a whole, to its stroke-level representation, or to the sequence information embedded in it; however, most focus on only part of this multi-faceted information. In this paper, we demonstrate for the first time that complementary information exists across all three facets of sketch data, and that segmentation performance benefits from exploiting such sketch-specific information. Specifically, we propose Sketch-Segformer, a transformer-based framework for sketch semantic segmentation that inherently treats sketches as stroke sequences rather than pixel maps. In particular, Sketch-Segformer introduces two types of structurally similar self-attention modules that operate over different receptive fields (i.e., the whole sketch or an individual stroke). The order embedding is then further synergized with spatial embeddings learned from the entire sketch as well as from localized stroke-level information. Extensive experiments show that our sketch-specific design not only obtains state-of-the-art performance on traditional figurative sketches (the SPG and SketchSeg-150K datasets) but, thanks to our use of multi-faceted sketch information, also performs well on creative sketches that do not conform to conventional object semantics (the CreativeSketch dataset). Ablation studies, visualizations, and invariance tests further justify our design choices and the effectiveness of Sketch-Segformer. Code is available at https://github.com/PRIS-CV/Sketch-SF.
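The abstract describes two structurally similar self-attention modules that differ only in their receptive field (whole sketch vs. individual stroke). A minimal NumPy sketch of that general idea, not the authors' implementation, realizes both variants with one scaled dot-product attention function plus an optional mask that restricts each point to its own stroke; the function and variable names here are illustrative assumptions:

```python
import numpy as np

def self_attention(x, mask=None):
    """Single-head scaled dot-product self-attention over point features x: (n, d).
    If mask is given, positions where mask is False are excluded from attention."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block attention outside the receptive field
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)          # row-wise softmax
    return w @ x

# Toy sketch: 5 points in 4-d feature space, grouped into 2 strokes.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4))
stroke_id = np.array([0, 0, 0, 1, 1])

# Whole-sketch receptive field: every point attends to every point.
global_out = self_attention(x)

# Stroke-level receptive field: points attend only within their own stroke.
stroke_mask = stroke_id[:, None] == stroke_id[None, :]
stroke_out = self_attention(x, stroke_mask)

print(global_out.shape, stroke_out.shape)  # (5, 4) (5, 4)
```

Because the mask drives out-of-stroke attention weights to zero, the stroke-level output for each stroke matches running unmasked attention on that stroke's points alone, which is the sense in which the two modules share a structure but differ in receptive field.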