Coarse Mask Guided Interactive Object Segmentation.

Authors

Li Jing, Fan Junsong, Wang Yuxi, Yang Yuran, Zhang Zhaoxiang

Publication information

IEEE Trans Image Process. 2023;32:5808-5822. doi: 10.1109/TIP.2023.3322564. Epub 2023 Oct 26.

Abstract

Interactive object segmentation aims to produce object masks from user interactions such as clicks, bounding boxes, and scribbles. Clicks are the most popular interactive cue because of their efficiency, and related deep learning methods have attracted considerable interest in recent years. Most works encode click points as Gaussian maps and concatenate them with the image as the model's input. However, the spatial and semantic information in the Gaussian maps becomes noisy as it passes through multiple convolution layers and is not fully exploited by the top layers for mask prediction. To pass click information to the top layers accurately and efficiently, we propose a coarse mask guided model (CMG), which predicts coarse masks with a coarse module to guide the object mask prediction. Specifically, the coarse module encodes user clicks as query features and enriches their semantic information with backbone features through transformer layers; coarse masks are then generated from the enriched query features and fed into CMG's decoder. Benefiting from the efficiency of the transformer, CMG's coarse module and decoder module are lightweight and computationally efficient, making the interaction process smoother. Experiments on several segmentation benchmarks demonstrate the effectiveness of our method, which achieves new state-of-the-art results compared with previous works.
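
For readers unfamiliar with the baseline input encoding that the abstract contrasts CMG against, the following is a minimal sketch of turning click points into Gaussian maps and concatenating them with the image channels. It is not the authors' code. The function names, the `sigma` parameter, the split into positive and negative click maps, and the use of a per-pixel maximum for overlapping clicks are all assumptions made for illustration; the abstract specifies none of them. Plain Python lists are used so the sketch runs without dependencies; a real pipeline would presumably use an array library.

```python
import math


def gaussian_click_map(height, width, clicks, sigma=10.0):
    """Return a height x width map with a Gaussian bump at each (row, col) click.

    `sigma` (in pixels) is a hypothetical parameter, not a value from the paper.
    """
    heat = [[0.0] * width for _ in range(height)]
    for cy, cx in clicks:
        for y in range(height):
            for x in range(width):
                d2 = (y - cy) ** 2 + (x - cx) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Assumption: overlapping clicks keep the strongest response.
                heat[y][x] = max(heat[y][x], value)
    return heat


def build_model_input(image, positive_clicks, negative_clicks, sigma=10.0):
    """Concatenate image channels with two click maps (channel-first layout).

    `image` is a list of channels, each a height x width list of lists.
    Separate positive/negative maps are an assumed convention.
    """
    height, width = len(image[0]), len(image[0][0])
    pos = gaussian_click_map(height, width, positive_clicks, sigma)
    neg = gaussian_click_map(height, width, negative_clicks, sigma)
    return list(image) + [pos, neg]


# Usage: a blank 3-channel 6x8 image with one click of each kind.
image = [[[0.0] * 8 for _ in range(6)] for _ in range(3)]
model_input = build_model_input(image, [(2, 3)], [(5, 7)], sigma=2.0)
print(len(model_input))  # number of input channels after concatenation
```

In this encoding the click location survives only as a peak in an extra input channel, which is the information the abstract argues gets degraded by successive convolution layers before it reaches the layers that predict the mask.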
