Unsupervised video segmentation algorithms based on flexibly regularized mixture models.

Authors

Launay Claire, Vacher Jonathan, Coen-Cagli Ruben

Affiliations

Dept. of Systems & Comp. Biology, AECOM, Bronx, NY, USA.

Laboratoire des Systèmes Perceptifs, DEC, ENS, PSL University, CNRS, Paris, France.

Publication information

Proc Int Conf Image Proc. 2022 Oct;2022:4073-4077. doi: 10.1109/icip46576.2022.9897691. Epub 2022 Oct 18.

Abstract

We propose a family of probabilistic segmentation algorithms for videos that rely on a generative model capturing static and dynamic natural image statistics. Our framework adopts flexibly regularized mixture models (FlexMM) [1], an efficient method for combining mixture distributions across different data sources. FlexMMs of Student-t distributions successfully segment static natural images through uncertainty-based information sharing between hidden layers of CNNs. We extend this approach to videos and use FlexMM to propagate segment labels across space and time. We show that temporal propagation improves the temporal consistency of segmentation, qualitatively reproducing a key aspect of human perceptual grouping. In addition, Student-t distributions can capture the statistics of optical flow in natural movies, which represents apparent motion in the video. Integrating these motion cues into our temporal FlexMM further improves the segmentation of each frame of natural movies. Our probabilistic dynamic segmentation algorithms thus provide a new framework for studying uncertainty in human dynamic perceptual segmentation.
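
The idea of propagating segment labels over time can be illustrated with a small sketch. This is not the authors' FlexMM implementation: it assumes scalar per-pixel features, fixed Student-t component parameters (no EM fitting), no spatial or cross-layer sharing, and a hypothetical mixing weight `alpha` that blends the previous frame's posterior into the next frame's prior. All function and parameter names are illustrative.

```python
import math
import numpy as np

def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a 1-D Student-t with location mu, scale sigma, nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * np.log1p(z ** 2 / nu))

def posteriors(frame, prior, mus, sigmas, nu):
    """Per-pixel posterior over K components; frame is (n_pixels,), prior is (n_pixels, K)."""
    loglik = np.stack(
        [student_t_logpdf(frame, m, s, nu) for m, s in zip(mus, sigmas)], axis=1
    )
    logpost = np.log(prior) + loglik
    logpost -= logpost.max(axis=1, keepdims=True)  # stabilize before exponentiating
    post = np.exp(logpost)
    return post / post.sum(axis=1, keepdims=True)

def segment_video(frames, mus, sigmas, nu=3.0, alpha=0.5):
    """Segment frames of shape (T, n_pixels), reusing each posterior as the next prior.

    alpha is an assumed blending weight: 0 treats frames independently,
    larger values carry more of the previous frame's labels forward.
    """
    n_components = len(mus)
    uniform = np.full((frames.shape[1], n_components), 1.0 / n_components)
    prior = uniform
    out = []
    for frame in frames:
        post = posteriors(frame, prior, mus, sigmas, nu)
        out.append(post)
        prior = alpha * post + (1 - alpha) * uniform  # temporal propagation step
    return np.stack(out)  # shape (T, n_pixels, K)

if __name__ == "__main__":
    # Frame 0 is unambiguous; frame 1 sits halfway between the two components.
    frames = np.array([[0.0, 5.0], [2.5, 2.5]])
    post = segment_video(frames, mus=[0.0, 5.0], sigmas=[1.0, 1.0])
    print(post[1])
```

In this toy setting the second frame is equally likely under both components, so its labels are decided entirely by the prior carried over from the first frame. With `alpha=0` the same frame would be split evenly between the two segments, which is the kind of temporal inconsistency that propagation is meant to reduce.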

Similar articles

1
Unsupervised video segmentation algorithms based on flexibly regularized mixture models.
Proc Int Conf Image Proc. 2022 Oct;2022:4073-4077. doi: 10.1109/icip46576.2022.9897691. Epub 2022 Oct 18.
2
Flexibly regularized mixture models and application to image segmentation.
Neural Netw. 2022 May;149:107-123. doi: 10.1016/j.neunet.2022.02.010. Epub 2022 Feb 15.
3
BM3E: discriminative density propagation for visual tracking.
IEEE Trans Pattern Anal Mach Intell. 2007 Nov;29(11):2030-44. doi: 10.1109/TPAMI.2007.1111.
4
Context-based segmentation of image sequences.
IEEE Trans Pattern Anal Mach Intell. 2006 Mar;28(3):463-8. doi: 10.1109/TPAMI.2006.47.
5
Video Salient Object Detection via Fully Convolutional Networks.
IEEE Trans Image Process. 2018;27(1):38-49. doi: 10.1109/TIP.2017.2754941.
6
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation.
IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9084-9097. doi: 10.1109/TNNLS.2024.3418980. Epub 2025 May 2.
7
Deep Video Prior for Video Consistency and Propagation.
IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):356-371. doi: 10.1109/TPAMI.2022.3142071. Epub 2022 Dec 5.
8
Selecting salient frames for spatiotemporal video modeling and segmentation.
IEEE Trans Image Process. 2007 Dec;16(12):3035-46. doi: 10.1109/tip.2007.908283.
9
Motion-Guided Cascaded Refinement Network for Video Object Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2020 Aug;42(8):1957-1967. doi: 10.1109/TPAMI.2019.2906175. Epub 2019 Mar 19.
10
Segmentation according to natural examples: learning static segmentation from motion segmentation.
IEEE Trans Pattern Anal Mach Intell. 2009 Apr;31(4):661-76. doi: 10.1109/TPAMI.2008.109.

Cited by

1
Measuring uncertainty in human visual segmentation.
PLoS Comput Biol. 2023 Sep 25;19(9):e1011483. doi: 10.1371/journal.pcbi.1011483. eCollection 2023 Sep.

References

1
Texture Interpolation for Probing Visual Perception.
Adv Neural Inf Process Syst. 2020 Dec;33:22146-22157.
2
Flexibly regularized mixture models and application to image segmentation.
Neural Netw. 2022 May;149:107-123. doi: 10.1016/j.neunet.2022.02.010. Epub 2022 Feb 15.
3
Normalization and pooling in hierarchical models of natural images.
Curr Opin Neurobiol. 2019 Apr;55:65-72. doi: 10.1016/j.conb.2019.01.008. Epub 2019 Feb 18.
4
Segmentation of Moving Objects by Long Term Video Analysis.
IEEE Trans Pattern Anal Mach Intell. 2014 Jun;36(6):1187-200. doi: 10.1109/TPAMI.2013.242.
5
A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization.
Psychol Bull. 2012 Nov;138(6):1172-217. doi: 10.1037/a0029333. Epub 2012 Jul 30.
6
Robust Student's-t mixture model with spatial constraints and its application in medical image segmentation.
IEEE Trans Med Imaging. 2012 Jan;31(1):103-16. doi: 10.1109/TMI.2011.2165342. Epub 2011 Aug 18.
7
Object segmentation from motion discontinuities and temporal occlusions--a biologically inspired model.
PLoS One. 2008;3(11):e3807. doi: 10.1371/journal.pone.0003807. Epub 2008 Nov 27.
8
A model of neuronal responses in visual area MT.
Vision Res. 1998 Mar;38(5):743-61. doi: 10.1016/s0042-6989(97)00183-1.
9
Motion integration across differing image features.
Vision Res. 1995 Aug;35(15):2137-46. doi: 10.1016/0042-6989(94)00299-1.
