High-arousal emotional speech enhances speech intelligibility and emotion recognition in noise.

Author information

Alexander Jessica M, Llanos Fernando

Affiliation

Department of Linguistics, University of Texas at Austin, Austin, Texas 78712, USA.

Publication information

J Acoust Soc Am. 2025 Jun 1;157(6):4085-4096. doi: 10.1121/10.0036812.

Abstract

Prosodic and voice quality modulations of the speech signal offer acoustic cues to the emotional state of the speaker. In quiet, listeners are highly adept at identifying not only a speaker's words but also the underlying emotional context. Given that distinct vocal emotions possess varying acoustic characteristics, background noise level may differentially impact speech recognition, emotion recognition, or their interaction. To investigate this question, we assessed the effects of three emotional speech styles (angry, happy, neutral) on speech intelligibility and emotion recognition across four different SNR levels. High-arousal emotional speech styles (happy and angry speech) enhanced both speech intelligibility and emotion recognition in noise. However, emotion recognition behavior was not a reliable predictor of speech recognition behavior. Instead, we found a strong correspondence between speech recognition scores and the relative power of the speech-in-noise signal in critical bands derived from the Speech Intelligibility Index. Unsupervised dimensional scaling analysis of emotion recognition patterns revealed that different noise baselines elicit different perceptual cue weighting strategies. Further dimensional scaling analysis revealed that emotion recognition patterns were best predicted by emotion-level differences in harmonic-to-noise ratio and variability around the fundamental frequency. Listeners may thus weight acoustic features differently for recognizing speech versus emotional patterns.
