Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2019 Nov;41(11):2693-2708. doi: 10.1109/TPAMI.2018.2858783. Epub 2018 Jul 24.

DOI: 10.1109/TPAMI.2018.2858783
PMID: 30047871

Abstract

Panoramic video provides immersive and interactive experience by enabling humans to control the field of view (FoV) through head movement (HM). Thus, HM plays a key role in modeling human attention on panoramic video. This paper establishes a database collecting subjects' HM in panoramic video sequences. From this database, we find that the HM data are highly consistent across subjects. Furthermore, we find that deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent's actions. Based on our findings, we propose a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In offline-DHP, multiple DRL workflows are run to determine potential HM positions at each panoramic frame. Then, a heat map of the potential HM positions, named the HM map, is generated as the output of offline-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model. Finally, the experiments validate that our approach is effective in both offline and online prediction of HM positions for panoramic video, and that the learned offline-DHP model can improve the performance of online-DHP.
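
The offline step described above, which turns a set of potential HM positions into an HM map for one frame, can be illustrated with a minimal sketch. The abstract does not say how the heat map is built, so everything here is an assumption made for illustration: the function names `great_circle_deg` and `hm_map`, the equirectangular grid size, the Gaussian spread `sigma_deg`, and the use of great-circle distance. It is not the authors' implementation.

```python
import math


def great_circle_deg(lon1, lat1, lon2, lat2):
    """Angular distance in degrees between two points on the viewing sphere."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))


def hm_map(positions, width=36, height=18, sigma_deg=10.0):
    """Build a peak-normalized heat map on an equirectangular grid.

    positions: list of (longitude, latitude) pairs in degrees, for example
               one predicted HM position per workflow for a single frame.
    sigma_deg: hypothetical Gaussian spread, not a value from the paper.
    """
    grid = [[0.0] * width for _ in range(height)]
    for row in range(height):
        # Latitude of the cell center, from +90 (top) down to -90 (bottom).
        lat = 90.0 - (row + 0.5) * 180.0 / height
        for col in range(width):
            # Longitude of the cell center, from -180 (left) to +180 (right).
            lon = -180.0 + (col + 0.5) * 360.0 / width
            for plon, plat in positions:
                d = great_circle_deg(lon, lat, plon, plat)
                grid[row][col] += math.exp(-d * d / (2.0 * sigma_deg ** 2))
    peak = max(max(r) for r in grid)
    if peak == 0.0:
        return grid
    return [[v / peak for v in r] for r in grid]


# Four hypothetical HM positions; two of them straddle the +/-180 seam.
example_positions = [(0.0, 0.0), (5.0, 0.0), (175.0, 0.0), (-175.0, 0.0)]
example_map = hm_map(example_positions)
```

Measuring distance on the sphere rather than in pixel coordinates keeps positions near the left and right edges of the equirectangular frame adjacent, as they are for a viewer turning their head.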

Similar articles

1
Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach.
IEEE Trans Pattern Anal Mach Intell. 2019 Nov;41(11):2693-2708. doi: 10.1109/TPAMI.2018.2858783. Epub 2018 Jul 24.
2
Saliency Prediction on Omnidirectional Image With Generative Adversarial Imitation Learning.
IEEE Trans Image Process. 2021;30:2087-2102. doi: 10.1109/TIP.2021.3050861. Epub 2021 Jan 21.
3
Graph Learning Based Head Movement Prediction for Interactive 360 Video Streaming.
IEEE Trans Image Process. 2021;30:4622-4636. doi: 10.1109/TIP.2021.3073283. Epub 2021 May 3.
4
Action-Driven Visual Object Tracking With Deep Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2239-2252. doi: 10.1109/TNNLS.2018.2801826.
5
Camera-Assisted Video Saliency Prediction and Its Applications.
IEEE Trans Cybern. 2018 Sep;48(9):2520-2530. doi: 10.1109/TCYB.2017.2741498. Epub 2017 Dec 21.
6
Incremental training of a detector using online sparse eigendecomposition.
IEEE Trans Image Process. 2011 Jan;20(1):213-26. doi: 10.1109/TIP.2010.2053548. Epub 2010 Jun 21.
7
SGaze: A Data-Driven Eye-Head Coordination Model for Realtime Gaze Prediction.
IEEE Trans Vis Comput Graph. 2019 May;25(5):2002-2010. doi: 10.1109/TVCG.2019.2899187. Epub 2019 Feb 18.
8
A deep-learning approach for online cell identification and trace extraction in functional two-photon calcium imaging.
Nat Commun. 2022 Mar 22;13(1):1529. doi: 10.1038/s41467-022-29180-0.
9
Viewport-Based CNN: A Multi-Task Approach for Assessing 360° Video Quality.
IEEE Trans Pattern Anal Mach Intell. 2022 Apr;44(4):2198-2215. doi: 10.1109/TPAMI.2020.3028509. Epub 2022 Mar 4.
10
Video-based head movement compensation for novel haploscopic eye-tracking apparatus.
Invest Ophthalmol Vis Sci. 2009 Mar;50(3):1152-7. doi: 10.1167/iovs.08-2739. Epub 2008 Oct 31.

Cited by

1
Enhancing 360 Video Streaming through Salient Content in Head-Mounted Displays.
Sensors (Basel). 2023 Apr 15;23(8):4016. doi: 10.3390/s23084016.