Liu Qin, Xu Zhenlin, Bertasius Gedas, Niethammer Marc
University of North Carolina at Chapel Hill.
Proc IEEE Int Conf Comput Vis. 2023 Oct;2023:22233-22243. doi: 10.1109/iccv51070.2023.02037.
Click-based interactive image segmentation aims to extract objects with a limited number of user clicks. Current methods typically rely on a hierarchical backbone. Recently, the plain, non-hierarchical Vision Transformer (ViT) has emerged as a competitive backbone for dense prediction tasks. This design allows the original ViT to serve as a foundation model that can be finetuned for downstream tasks without redesigning a hierarchical backbone for pretraining. Although this design is simple and has proven effective, it has not yet been explored for interactive image segmentation. To fill this gap, we propose SimpleClick, the first interactive segmentation method that leverages a plain backbone. On top of the plain backbone, we introduce a symmetric patch embedding layer that encodes clicks into the backbone with only minor modifications to the backbone itself. With the plain backbone pretrained as a masked autoencoder (MAE), SimpleClick achieves state-of-the-art performance. Remarkably, our method improves over the previous best NoC@90 result on SBD. Extensive evaluation on medical images demonstrates the generalizability of our method. We also provide a detailed computational analysis, highlighting the suitability of our method as a practical annotation tool.
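The abstract reports results as NoC@90 without defining the metric. As it is commonly used in interactive segmentation, NoC@k counts the clicks needed for the predicted mask to reach k% IoU with the ground truth, averaged over images. The sketch below illustrates that computation only. The function names `noc_at` and `mean_noc`, the 20-click cap, and the IoU curves are assumptions made for illustration; they are not taken from the paper or its code.

```python
from typing import Sequence


def noc_at(iou_per_click: Sequence[float],
           threshold: float = 0.90,
           max_clicks: int = 20) -> int:
    """Number of clicks needed for IoU to reach `threshold`.

    iou_per_click[i] is the IoU after click i + 1.
    If the threshold is never reached, the click budget `max_clicks`
    is returned (an assumed convention, labeled as such above).
    """
    for clicks, iou in enumerate(iou_per_click[:max_clicks], start=1):
        if iou >= threshold:
            return clicks
    return max_clicks


def mean_noc(curves: Sequence[Sequence[float]],
             threshold: float = 0.90,
             max_clicks: int = 20) -> float:
    """Average NoC over a set of per-image IoU curves."""
    return sum(noc_at(c, threshold, max_clicks) for c in curves) / len(curves)


# Hypothetical IoU curves for three images (made-up numbers).
curves = [
    [0.62, 0.85, 0.91, 0.93],  # reaches 0.90 at click 3
    [0.95],                    # reaches 0.90 at click 1
    [0.40, 0.55, 0.60],        # never reaches 0.90, counted as 20
]
print(mean_noc(curves))  # (3 + 1 + 20) / 3 = 8.0
```

Lower values are better: a lower NoC@90 means fewer user clicks are needed to obtain a mask of the target quality.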