
Speaker-story mapping as a method to evaluate audiovisual scene analysis in a virtual classroom scenario.

Author Information

Fremerey Stephan, Breuer Carolin, Leist Larissa, Klatte Maria, Fels Janina, Raake Alexander

Affiliations

Audiovisual Technology Group, Technische Universität Ilmenau, Ilmenau, Germany.

Institute for Hearing Technology and Acoustics, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Aachen, Germany.

Publication Information

Front Psychol. 2025 Jun 10;16:1520630. doi: 10.3389/fpsyg.2025.1520630. eCollection 2025.

Abstract

This study explores how audiovisual immersive virtual environments (IVEs) can be used to assess cognitive performance in classroom-like settings, addressing limitations of simpler acoustic and visual representations. It examines the potential of a speaker-story mapping test paradigm, termed "audiovisual scene analysis (AV-SA)" and originally developed for virtual reality (VR) hearing research, as a method to evaluate audiovisual scene analysis in a virtual classroom scenario. Factors affecting the acoustic and visual scene representation were varied to investigate their impact on audiovisual scene analysis. Two acoustic representations were used: a simple "diotic" presentation, in which the same signal is presented to both ears, and a dynamically live-rendered binaural synthesis ("binaural"). Two visual representations were used: 360°/omnidirectional video with intrinsic lip-sync and computer-generated imagery (CGI) without lip-sync. Three subjective experiments were conducted with different combinations of the acoustic and visual conditions: the first experiment, involving 36 participants, used 360° video with "binaural" audio; the second, with 24 participants, combined 360° video with "diotic" audio; the third, with 34 participants, used the CGI environment with "binaural" audio. Each environment presented 20 different speakers in a classroom-like circle of 20 chairs, with the number of simultaneously active speakers ranging from 2 to 10, while the remaining speakers remained silent but were always visible. During the experiments, the subjects' task was to correctly map the stories' topics to the corresponding speakers. The primary dependent variable was the number of correct assignments within a fixed period of 2 min; two questionnaires on mental load followed each trial. In addition, before and/or after the experiments, subjects completed questionnaires on simulator sickness, noise sensitivity, and presence. Results indicate that the experimental condition significantly influenced task performance, mental load, and user behavior, but did not affect perceived simulator sickness or presence. Performance was lower in both the "diotic" audio with 360° video experiment and the "binaural" audio with CGI experiment than in the 360° video with "binaural" audio experiment, demonstrating the usefulness of the test method for investigating influences on cognitive audiovisual scene analysis performance.
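
The two acoustic representations differ only in how each talker's signal reaches the listener's ears. The minimal sketch below (Python with NumPy/SciPy) illustrates that contrast; it is not the authors' rendering pipeline. The "diotic" condition copies one mono signal into two identical channels, while the "binaural" condition convolves the signal with a left/right head-related impulse response (HRIR) pair for the talker's direction. The toy_hrir helper, the 48 kHz sample rate, and the simple interaural time/level differences are assumptions for illustration; the study used a dynamically head-tracked, live-rendered binaural synthesis rather than the static rendering shown here.

```python
# Illustrative sketch only: "diotic" vs. a crude static "binaural" rendering.
# Placeholder HRIRs; real systems use measured HRTFs and head tracking.
import numpy as np
from scipy.signal import fftconvolve

FS = 48_000  # sample rate in Hz (assumption, not stated in the abstract)


def diotic(mono: np.ndarray) -> np.ndarray:
    """'Diotic' presentation: the identical signal is fed to both ears."""
    return np.stack([mono, mono], axis=0)


def toy_hrir(azimuth_deg: float, fs: int = FS):
    """Crude stand-in for a measured HRIR pair: an interaural time difference
    (up to ~0.7 ms) plus an attenuation of the ear facing away from the source."""
    az = np.deg2rad(azimuth_deg)
    itd = 0.0007 * np.sin(az)                         # seconds; >0 means source on the right
    gain_left = 1.0 - 0.4 * max(np.sin(az), 0.0)      # attenuate far (left) ear
    gain_right = 1.0 - 0.4 * max(-np.sin(az), 0.0)    # attenuate far (right) ear
    n = int(0.002 * fs)                               # 2 ms impulse responses
    left, right = np.zeros(n), np.zeros(n)
    left[int(round(max(itd, 0.0) * fs))] = gain_left      # delayed if source is to the right
    right[int(round(max(-itd, 0.0) * fs))] = gain_right   # delayed if source is to the left
    return left, right


def binaural(mono: np.ndarray, azimuth_deg: float) -> np.ndarray:
    """Static binaural rendering: convolve the mono source with a left/right HRIR pair."""
    hl, hr = toy_hrir(azimuth_deg)
    return np.stack([fftconvolve(mono, hl), fftconvolve(mono, hr)], axis=0)


# Example: one talker at 45 degrees to the right; 1 s of noise stands in for speech.
speech = np.random.randn(FS)
channels_diotic = diotic(speech)            # two identical channels
channels_binaural = binaural(speech, 45.0)  # lateralized left/right channels
```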

Article figure (fpsyg-16-1520630-g0001): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4187/12185469/d6c53e3d9e29/fpsyg-16-1520630-g0001.jpg
