End-to-End Video Saliency Detection via a Deep Contextual Spatiotemporal Network.

Authors

Wei Lina, Zhao Shanshan, Bourahla Omar Farouk, Li Xi, Wu Fei, Zhuang Yueting, Han Junwei, Xu Mingliang

Publication information

IEEE Trans Neural Netw Learn Syst. 2021 Apr;32(4):1691-1702. doi: 10.1109/TNNLS.2020.2986823. Epub 2021 Apr 2.

DOI:10.1109/TNNLS.2020.2986823

Abstract

As an interesting and important problem in computer vision, learning-based video saliency detection aims to discover the visually interesting regions in a video sequence. Capturing information within and between frames from different aspects (such as spatial contexts, motion information, temporal consistency across frames, and multiscale representation) is important for this task. A key issue is how to jointly model all these factors within a unified data-driven scheme in an end-to-end fashion. In this article, we propose an end-to-end spatiotemporal deep video saliency detection approach, which captures information on spatial contexts and motion characteristics. Furthermore, it encodes the temporal consistency information across consecutive frames by implementing a convolutional long short-term memory (Conv-LSTM) model. In addition, the multiscale saliency properties of each frame are adaptively integrated for the final saliency prediction in a collaborative feature-pyramid way. Finally, the proposed approach unifies all the aforementioned parts into an end-to-end joint deep learning scheme. Experimental results demonstrate the effectiveness of our approach in comparison with the state-of-the-art approaches.
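To make the Conv-LSTM idea in the abstract concrete, the following is a minimal sketch of one recurrent step, in which every gate is computed by convolving the current frame and the previous hidden state, so that the memory keeps its spatial layout from frame to frame. It is not the authors' network: it assumes a single channel, omits the optional peephole connections, and all names (`conv2d_same`, `convlstm_step`, `run_sequence`) and kernel values are hypothetical, chosen only for illustration.

```python
import math


def conv2d_same(img, kernel):
    """Zero-padded 'same' 2D cross-correlation (odd-sized square kernel)."""
    h, w = len(img), len(img[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy][dx] * img[yy][xx]
            out[y][x] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(x, h_prev, c_prev, params):
    """One Conv-LSTM step on single-channel maps.

    params maps each gate name ('i', 'f', 'o', 'g') to a tuple
    (input_kernel, hidden_kernel, bias). Assumed layout, for illustration.
    """
    height, width = len(x), len(x[0])
    pre = {}
    for gate, (wx, wh, b) in params.items():
        ax = conv2d_same(x, wx)        # contribution of the current frame
        ah = conv2d_same(h_prev, wh)   # contribution of the previous state
        pre[gate] = [[ax[r][c] + ah[r][c] + b for c in range(width)]
                     for r in range(height)]
    h_new = [[0.0] * width for _ in range(height)]
    c_new = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            i = sigmoid(pre["i"][r][c])    # input gate
            f = sigmoid(pre["f"][r][c])    # forget gate
            o = sigmoid(pre["o"][r][c])    # output gate
            g = math.tanh(pre["g"][r][c])  # candidate memory
            c_new[r][c] = f * c_prev[r][c] + i * g
            h_new[r][c] = o * math.tanh(c_new[r][c])
    return h_new, c_new


def run_sequence(frames, params):
    """Run the cell over consecutive frames, starting from zero state."""
    height, width = len(frames[0]), len(frames[0][0])
    h = [[0.0] * width for _ in range(height)]
    c = [[0.0] * width for _ in range(height)]
    outputs = []
    for frame in frames:
        h, c = convlstm_step(frame, h, c, params)
        outputs.append(h)
    return outputs


# Hypothetical 3x3 kernels: a centre tap for every gate.
centre = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
demo_params = {gate: (centre, centre, 0.0) for gate in "ifog"}
demo_frames = [[[0.0, 1.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]]]
demo_states = run_sequence(demo_frames, demo_params)
```

Each entry of `demo_states` is a hidden-state map with the same height and width as the input frame. In a full model, such maps would be multi-channel feature tensors with learned kernels, and the hidden state would feed the per-frame saliency prediction.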

Similar articles

End-to-End Video Saliency Detection via a Deep Contextual Spatiotemporal Network.

IEEE Trans Neural Netw Learn Syst. 2021 Apr;32(4):1691-1702. doi: 10.1109/TNNLS.2020.2986823. Epub 2021 Apr 2.

Video Salient Object Detection via Fully Convolutional Networks.

IEEE Trans Image Process. 2018;27(1):38-49. doi: 10.1109/TIP.2017.2754941.

Deep Group-wise Fully Convolutional Network for Co-saliency Detection with Graph Propagation.

IEEE Trans Image Process. 2019 Apr 15. doi: 10.1109/TIP.2019.2909649.

A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction.

IEEE Trans Image Process. 2021;30:572-587. doi: 10.1109/TIP.2020.3036749. Epub 2020 Nov 24.

SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection.

IEEE Trans Cybern. 2019 Aug;49(8):2900-2911. doi: 10.1109/TCYB.2018.2832053. Epub 2018 May 25.

A Deep Spatial Contextual Long-Term Recurrent Convolutional Network for Saliency Detection.

IEEE Trans Image Process. 2018 Jul;27(7):3264-3274. doi: 10.1109/TIP.2018.2817047.

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection.

IEEE Trans Image Process. 2016 Aug;25(8):3919-30. doi: 10.1109/TIP.2016.2579306. Epub 2016 Jun 9.

Deep3DSaliency: Deep Stereoscopic Video Saliency Detection Model by 3D Convolutional Networks.

IEEE Trans Image Process. 2018 Dec 5. doi: 10.1109/TIP.2018.2885229.

Video Saliency Detection via Sparsity-Based Reconstruction and Propagation.

IEEE Trans Image Process. 2019 Oct;28(10):4819-4831. doi: 10.1109/TIP.2019.2910377. Epub 2019 May 2.

Video Salient Object Detection Using Spatiotemporal Deep Features.

IEEE Trans Image Process. 2018 Oct;27(10):5002-5015. doi: 10.1109/TIP.2018.2849860.

Cited by

Revolution or Evolution? Technical Requirements and Considerations towards 6G Mobile Communications.

Sensors (Basel). 2022 Jan 20;22(3):762. doi: 10.3390/s22030762.
