Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation

Author information

Yin Yingjie, Xu De, Wang Xingang, Zhang Lei

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):3884-3894. doi: 10.1109/TNNLS.2021.3054769. Epub 2022 Aug 3.

DOI:10.1109/TNNLS.2021.3054769

Abstract

Most recent semisupervised video object segmentation (VOS) methods rely on fine-tuning deep convolutional neural networks online using the given mask of the first frame or predicted masks of subsequent frames. However, the online fine-tuning process is usually time-consuming, limiting the practical use of such methods. We propose a directional deep embedding and appearance learning (DDEAL) method, which is free of the online fine-tuning process, for fast VOS. First, a global directional matching module (GDMM), which can be efficiently implemented by parallel convolutional operations, is proposed to learn a semantic pixel-wise embedding as an internal guidance. Second, an effective directional appearance model-based statistics is proposed to represent the target and background on a spherical embedding space for VOS. Equipped with the GDMM and the directional appearance model learning module, DDEAL learns static cues from the labeled first frame and dynamically updates cues of the subsequent frames for object segmentation. Our method exhibits state-of-the-art VOS performance without using online fine-tuning. Specifically, it achieves a J & F mean score of 74.8% on the DAVIS 2017 data set and an overall score G of 71.3% on the large-scale YouTube-VOS data set, while retaining a speed of 25 fps on a single NVIDIA TITAN Xp GPU. Furthermore, our faster version runs at 31 fps with only a small loss in accuracy.
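As a rough illustration of what matching on a spherical embedding space can mean, the sketch below L2-normalizes pixel embeddings so that they lie on the unit sphere and labels each query pixel by whichever reference set (target or background) contains its best cosine match. This is a generic nearest-neighbor scheme written for illustration only: it does not reproduce the GDMM or the directional appearance model of DDEAL, and the function names and toy 2-D embeddings are assumptions that do not come from the paper.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit sphere (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def best_cosine(query, references):
    """Highest cosine similarity between one query embedding and a set of references."""
    q = l2_normalize(query)
    return max(
        sum(a * b for a, b in zip(q, l2_normalize(r)))
        for r in references
    )


def segment(query_pixels, fg_refs, bg_refs):
    """Label each query embedding 1 (target) or 0 (background) by its better-matching set."""
    return [
        1 if best_cosine(p, fg_refs) > best_cosine(p, bg_refs) else 0
        for p in query_pixels
    ]


# Hypothetical toy embeddings: reference pixels from a labeled first frame.
fg_refs = [[1.0, 0.1], [0.9, 0.2]]   # pixels inside the given mask
bg_refs = [[-1.0, 0.0], [0.0, 1.0]]  # pixels outside the given mask

# Hypothetical embeddings of three pixels in a later frame.
query = [[2.0, 0.3], [-0.5, 0.1], [0.1, 3.0]]
labels = segment(query, fg_refs, bg_refs)
print(labels)
```

Because only the direction of each embedding matters after normalization, the first query pixel is matched to the target even though its magnitude differs from every reference vector.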


Similar articles


Online Meta Adaptation for Fast Video Object Segmentation.

IEEE Trans Pattern Anal Mach Intell. 2020 May;42(5):1205-1217. doi: 10.1109/TPAMI.2018.2890659. Epub 2019 Jan 14.

Region Aware Video Object Segmentation With Deep Motion Modeling.

IEEE Trans Image Process. 2024;33:2639-2651. doi: 10.1109/TIP.2024.3381445. Epub 2024 Apr 3.

Meta-VOS: Learning to Adapt Online Target-Specific Segmentation.

IEEE Trans Image Process. 2021;30:4760-4772. doi: 10.1109/TIP.2021.3075086. Epub 2021 May 5.

Beyond Appearance: Multi-Frame Spatio-Temporal Context Memory Networks for Efficient and Robust Video Object Segmentation.

IEEE Trans Image Process. 2024;33:4853-4866. doi: 10.1109/TIP.2024.3423390. Epub 2024 Sep 5.

SpVOS: Efficient Video Object Segmentation With Triple Sparse Convolution.

IEEE Trans Image Process. 2023;32:5977-5991. doi: 10.1109/TIP.2023.3327588. Epub 2023 Nov 7.

Self-Teaching Video Object Segmentation.

IEEE Trans Neural Netw Learn Syst. 2022 Apr;33(4):1623-1637. doi: 10.1109/TNNLS.2020.3043099. Epub 2022 Apr 4.

Video Salient Object Detection via Fully Convolutional Networks.

IEEE Trans Image Process. 2018;27(1):38-49. doi: 10.1109/TIP.2017.2754941.

Adaptive Online Mutual Learning Bi-Decoders for Video Object Segmentation.

IEEE Trans Image Process. 2022;31:7063-7077. doi: 10.1109/TIP.2022.3219230. Epub 2022 Nov 15.

Self Supervised Progressive Network for High Performance Video Object Segmentation.

IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):7671-7684. doi: 10.1109/TNNLS.2022.3219936. Epub 2024 Jun 3.


PMID: 33587708
