
Speaker-story mapping as a method to evaluate audiovisual scene analysis in a virtual classroom scenario.

Author Information

Fremerey Stephan, Breuer Carolin, Leist Larissa, Klatte Maria, Fels Janina, Raake Alexander

Affiliations

Audiovisual Technology Group, Technische Universität Ilmenau, Ilmenau, Germany.

Institute for Hearing Technology and Acoustics, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University, Aachen, Germany.

Publication Information

Front Psychol. 2025 Jun 10;16:1520630. doi: 10.3389/fpsyg.2025.1520630. eCollection 2025.

Abstract

This study explores how audiovisual immersive virtual environments (IVEs) can be used to assess cognitive performance in classroom-like settings, addressing limitations of simpler acoustic and visual representations. It examines the potential of a speaker-story mapping test paradigm, termed "audiovisual scene analysis (AV-SA)" and originally developed for virtual reality (VR) hearing research, as a method to evaluate audiovisual scene analysis in a virtual classroom scenario. Factors affecting the acoustic and visual scene representation were varied to investigate their impact on audiovisual scene analysis. Two acoustic representations were used: a simple "diotic" presentation, in which the same signal is presented to both ears, and a dynamically live-rendered binaural synthesis ("binaural"). Two visual representations were used: 360°/omnidirectional video with intrinsic lip-sync and computer-generated imagery (CGI) without lip-sync. Three subjective experiments were conducted with different combinations of the acoustic and visual conditions: the first experiment, involving 36 participants, used 360° video with "binaural" audio; the second, with 24 participants, combined 360° video with "diotic" audio; the third, with 34 participants, used the CGI environment with "binaural" audio. Each environment presented 20 different speakers in a classroom-like circle of 20 chairs, with the number of simultaneously active speakers ranging from 2 to 10, while the remaining speakers remained silent but were always visible. During the experiments, the subjects' task was to correctly map the stories' topics to the corresponding speakers. The primary dependent variable was the number of correct assignments within a fixed period of 2 min; two questionnaires on mental load followed each trial. In addition, before and/or after the experiments, subjects completed questionnaires on simulator sickness, noise sensitivity, and presence. Results indicate that the experimental condition significantly influenced task performance, mental load, and user behavior, but did not affect perceived simulator sickness or presence. Performance was lower in both the "diotic" audio with 360° video experiment and the "binaural" audio with CGI experiment than in the 360° video with "binaural" audio experiment, demonstrating the usefulness of the test method for investigating influences on cognitive audiovisual scene analysis performance.
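
The two acoustic representations differ only in how each talker's signal reaches the listener's ears. The minimal sketch below (Python with NumPy/SciPy) illustrates that contrast; it is not the authors' rendering pipeline. The "diotic" condition copies one mono signal into two identical channels, while the "binaural" condition convolves the signal with a left/right head-related impulse response (HRIR) pair for the talker's direction. The toy_hrir helper, the 48 kHz sample rate, and the simple interaural time/level differences are assumptions for illustration; the study used a dynamically head-tracked, live-rendered binaural synthesis rather than the static rendering shown here.

```python
# Illustrative sketch only: "diotic" vs. a crude static "binaural" rendering.
# Placeholder HRIRs; real systems use measured HRTFs and head tracking.
import numpy as np
from scipy.signal import fftconvolve

FS = 48_000  # sample rate in Hz (assumption, not stated in the abstract)


def diotic(mono: np.ndarray) -> np.ndarray:
    """'Diotic' presentation: the identical signal is fed to both ears."""
    return np.stack([mono, mono], axis=0)


def toy_hrir(azimuth_deg: float, fs: int = FS):
    """Crude stand-in for a measured HRIR pair: an interaural time difference
    (up to ~0.7 ms) plus an attenuation of the ear facing away from the source."""
    az = np.deg2rad(azimuth_deg)
    itd = 0.0007 * np.sin(az)                         # seconds; >0 means source on the right
    gain_left = 1.0 - 0.4 * max(np.sin(az), 0.0)      # attenuate far (left) ear
    gain_right = 1.0 - 0.4 * max(-np.sin(az), 0.0)    # attenuate far (right) ear
    n = int(0.002 * fs)                               # 2 ms impulse responses
    left, right = np.zeros(n), np.zeros(n)
    left[int(round(max(itd, 0.0) * fs))] = gain_left      # delayed if source is to the right
    right[int(round(max(-itd, 0.0) * fs))] = gain_right   # delayed if source is to the left
    return left, right


def binaural(mono: np.ndarray, azimuth_deg: float) -> np.ndarray:
    """Static binaural rendering: convolve the mono source with a left/right HRIR pair."""
    hl, hr = toy_hrir(azimuth_deg)
    return np.stack([fftconvolve(mono, hl), fftconvolve(mono, hr)], axis=0)


# Example: one talker at 45 degrees to the right; 1 s of noise stands in for speech.
speech = np.random.randn(FS)
channels_diotic = diotic(speech)            # two identical channels
channels_binaural = binaural(speech, 45.0)  # lateralized left/right channels
```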

Article figure (fpsyg-16-1520630-g0001): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4187/12185469/d6c53e3d9e29/fpsyg-16-1520630-g0001.jpg
