Action recognition based on multimode fusion for VR online platform.

Authors

Li Xuan, Chen Hengxin, He Shengdong, Chen Xinrun, Dong Shuang, Yan Ping, Fang Bin

Affiliation

College of Computer Science, Chongqing University, Chongqing, 400044 China.

Publication information

Virtual Real. 2023 Feb 24:1-16. doi: 10.1007/s10055-023-00773-4.

Abstract

Today's popular online communication platforms convey information only as text, voice, pictures, and other electronic media, so the richness and reliability of that information do not match traditional face-to-face communication. Virtual reality (VR) technology offers a viable alternative. On current VR online communication platforms, users appear in a virtual world as avatars, which achieves "face-to-face" communication to a certain extent. However, the avatar's actions do not follow the user's, which makes communication less realistic. Decision-makers also need to base decisions on the behavior of VR users, yet there are no effective methods for collecting action data in VR environments. In our work, we collect three modalities of data for nine actions performed by VR users, using the built-in sensors of a virtual reality head-mounted display (VR HMD), RGB cameras, and human pose estimation. Using these data and advanced multimodal fusion action recognition networks, we obtain a high-accuracy action recognition model. In addition, we use the VR HMD to collect 3D position data and design a 2D key point augmentation scheme for VR users. With the augmented 2D key point data and the VR HMD sensor data, we can train action recognition models with high accuracy and strong stability. Our data collection and experiments focus on classroom scenes, and the results can be extended to other scenes.
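To make the idea of multimodal fusion concrete, the following is a minimal sketch of one common generic scheme, score-level (late) fusion, in which each modality produces class scores for the nine actions and the per-modality probabilities are averaged. The abstract does not state which fusion strategy the networks use, so this is only an illustration under that assumption. The modality names (`hmd_sensors`, `rgb_video`, `pose_keypoints`), the function `late_fusion`, and the example scores are all hypothetical.

```python
import math

NUM_ACTIONS = 9  # the abstract reports nine action classes


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(logits_by_modality, weights=None):
    """Fuse per-modality class scores by a weighted average of probabilities.

    logits_by_modality: dict mapping a (hypothetical) modality name to a
    list of NUM_ACTIONS raw scores. weights: optional dict of non-negative
    per-modality weights; equal weights are assumed when omitted.
    Returns (predicted_class_index, fused_probabilities).
    """
    names = list(logits_by_modality)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)

    fused = [0.0] * NUM_ACTIONS
    for name in names:
        logits = logits_by_modality[name]
        if len(logits) != NUM_ACTIONS:
            raise ValueError(f"{name}: expected {NUM_ACTIONS} scores")
        share = weights[name] / total_weight
        for i, p in enumerate(softmax(logits)):
            fused[i] += share * p

    predicted = max(range(NUM_ACTIONS), key=fused.__getitem__)
    return predicted, fused


# Made-up scores for illustration only; not data from the paper.
example_logits = {
    "hmd_sensors":    [0.1, 2.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "rgb_video":      [0.2, 1.5, 1.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "pose_keypoints": [0.0, 2.2, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}
predicted, fused = late_fusion(example_logits)
```

In this made-up example the RGB scores alone slightly favor class 2, but the fused prediction is class 1 because the other two modalities agree on it. That is the basic motivation for combining modalities: a single unreliable stream is outvoted by the rest.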

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/566b/9955528/0c0bbf606a1c/10055_2023_773_Fig1_HTML.jpg
