Hierarchical Graph Pattern Understanding for Zero-Shot Video Object Segmentation.

Authors

Pei Gensheng, Shen Fumin, Yao Yazhou, Chen Tao, Hua Xian-Sheng, Shen Heng-Tao

Publication information

IEEE Trans Image Process. 2023;32:5909-5920. doi: 10.1109/TIP.2023.3326395. Epub 2023 Nov 1.

DOI: 10.1109/TIP.2023.3326395
PMID: 37883290
Abstract

The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene. The temporal consistency provided by the optical flow could be effectively supplemented by modeling in a structural form. This paper proposes a new hierarchical graph neural network (GNN) architecture, dubbed hierarchical graph pattern understanding (HGPU), for zero-shot video object segmentation (ZS-VOS). Inspired by the strong ability of GNNs in capturing structural relations, HGPU innovatively leverages motion cues (i.e., optical flow) to enhance the high-order representations from the neighbors of target frames. Specifically, a hierarchical graph pattern encoder with message aggregation is introduced to acquire different levels of motion and appearance features in a sequential manner. Furthermore, a decoder is designed for hierarchically parsing and understanding the transformed multi-modal contexts to achieve more accurate and robust results. HGPU achieves state-of-the-art performance on four publicly available benchmarks (DAVIS-16, YouTube-Objects, Long-Videos and DAVIS-17). Code and pre-trained model can be found at https://github.com/NUST-Machine-Intelligence-Laboratory/HGPU.
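As a rough illustration of the idea of message aggregation across levels, the sketch below updates appearance node features with attention-weighted messages from motion (optical flow) node features, one level at a time. It is not the HGPU implementation: the function names `aggregate_messages` and `hierarchical_encode`, the dot-product attention, and the residual update are all assumptions made for illustration, and real features would be learned tensors rather than short Python lists. The authors' actual code is in the repository linked in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_messages(target, neighbors):
    """One round of attention-weighted message aggregation (illustrative).

    target:    feature vector of the node being updated
    neighbors: feature vectors of the nodes sending messages
    Returns the target plus the weighted sum of neighbor features.
    """
    # Assumption: attention weights come from plain dot-product similarity.
    weights = softmax([dot(target, n) for n in neighbors])
    message = [
        sum(w * n[i] for w, n in zip(weights, neighbors))
        for i in range(len(target))
    ]
    # Assumption: residual update, so the original feature is preserved.
    return [t + m for t, m in zip(target, message)]


def hierarchical_encode(appearance_levels, motion_levels):
    """Apply message aggregation level by level (illustrative).

    appearance_levels[k] and motion_levels[k] hold the node features at
    level k; each appearance node is updated with messages from every
    motion node at the same level.
    """
    encoded = []
    for app_nodes, mot_nodes in zip(appearance_levels, motion_levels):
        encoded.append([aggregate_messages(a, mot_nodes) for a in app_nodes])
    return encoded


if __name__ == "__main__":
    # Two levels with different (toy) feature sizes.
    appearance = [[[1.0, 0.0]], [[0.0, 1.0, 0.0]]]
    motion = [[[0.0, 2.0], [0.0, 2.0]], [[1.0, 1.0, 1.0]]]
    print(hierarchical_encode(appearance, motion))
```

In this toy setup each level is processed independently; a fuller sketch would also pass the updated features from one level into the next, which is closer to the sequential encoding the abstract describes.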

