Perception of emotional valences and activity levels from vowel segments of continuous speech.

Author information

Department of Speech Communication and Voice Research, University of Tampere, Tampere, Finland.

Publication information

J Voice. 2010 Jan;24(1):30-8. doi: 10.1016/j.jvoice.2008.04.004. Epub 2008 Dec 25.

DOI:10.1016/j.jvoice.2008.04.004

PMID:19111438

Abstract

This study aimed to investigate the role of voice source and formant frequencies in the perception of emotional valence and psychophysiological activity level from short vowel samples (approximately 150 milliseconds). Nine professional actors (five males and four females) read a prose passage simulating joy, tenderness, sadness, anger, and a neutral emotional state. The stress-carrying vowel [a:] was extracted from continuous speech in the Finnish word [ta:k:ahan] and analyzed for duration, fundamental frequency (F0), equivalent sound level (L_eq), alpha ratio, and formant frequencies F1-F4. Alpha ratio was calculated by subtracting the L_eq (dB) in the range 50 Hz-1 kHz from the L_eq in the range 1-5 kHz. The samples were inverse filtered by Iterative Adaptive Inverse Filtering, and the resulting estimates of the glottal flow were parameterized with the normalized amplitude quotient (NAQ = f_AC / (d_peak · T)). Fifty listeners (mean age 28.5 years) identified the emotional valences from the randomized samples. Multinomial Logistic Regression Analysis was used to study the interrelations of the parameters for perception. It appeared to be possible to identify valences from vowel samples of short duration (approximately 150 milliseconds). NAQ tended to differentiate between the valences and activity levels perceived in both genders. Voice source may not only reflect variations of F0 and L_eq, but may also have an independent role in expression, reflecting phonation types. To some extent, formant frequencies appeared to be related to valence perception, but no clear patterns could be identified. Coding of valence tends to be a complicated multiparameter phenomenon with wide individual variation.
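The two derived measures named in the abstract, alpha ratio and NAQ, can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names, the input format (a list of spectral powers per frequency bin, and one sampled cycle of estimated glottal flow), the band-edge handling at exactly 1 kHz, and the finite-difference derivative are all assumptions made for illustration. The sketch reads f_AC as the peak-to-peak flow amplitude, d_peak as the magnitude of the negative peak of the flow derivative, and T as the period length, which is the usual reading of the NAQ formula quoted above.

```python
import math


def alpha_ratio_db(freqs_hz, powers):
    """Level in the 1-5 kHz band minus level in the 50 Hz-1 kHz band, in dB.

    Assumption: `powers` holds linear spectral power per frequency bin, so a
    difference of band levels equals 10*log10 of the band power ratio.
    """
    low = sum(p for f, p in zip(freqs_hz, powers) if 50.0 <= f < 1000.0)
    high = sum(p for f, p in zip(freqs_hz, powers) if 1000.0 <= f <= 5000.0)
    if low <= 0.0 or high <= 0.0:
        raise ValueError("both bands need positive power")
    return 10.0 * math.log10(high / low)


def naq(flow, sample_rate_hz, f0_hz):
    """Normalized amplitude quotient, f_AC / (d_peak * T), for one flow cycle.

    Assumption: `flow` is one cycle of an (already inverse-filtered) glottal
    flow estimate; the derivative is approximated by first differences.
    """
    if len(flow) < 2:
        raise ValueError("need at least two flow samples")
    f_ac = max(flow) - min(flow)  # peak-to-peak flow amplitude
    # Largest downward slope of the flow, as a positive number.
    d_peak = max(-(b - a) * sample_rate_hz for a, b in zip(flow, flow[1:]))
    if d_peak <= 0.0:
        raise ValueError("flow never decreases; no closing phase found")
    period_s = 1.0 / f0_hz
    return f_ac / (d_peak * period_s)


# Hypothetical two-bin spectrum: the high band carries one tenth of the
# low-band power, so the alpha ratio is about -10 dB.
example_alpha = alpha_ratio_db([200.0, 2000.0], [1.0, 0.1])

# Hypothetical triangular flow pulse at F0 = 100 Hz sampled at 10 kHz:
# 40 samples opening, 20 samples closing, 40 samples closed.
pulse = [i / 40 for i in range(40)]
pulse += [1.0 - i / 20 for i in range(20)]
pulse += [0.0] * 40
example_naq = naq(pulse, sample_rate_hz=10000.0, f0_hz=100.0)
```

For a triangular pulse, NAQ reduces to the closing phase as a fraction of the period (20 of 100 samples here). This is why a smaller NAQ is generally associated with a more abrupt closure and a more pressed phonation type.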


Similar articles

Perception of emotional valences and activity levels from vowel segments of continuous speech.

J Voice. 2010 Jan;24(1):30-8. doi: 10.1016/j.jvoice.2008.04.004. Epub 2008 Dec 25.

Monopitched expression of emotions in different vowels.

Folia Phoniatr Logop. 2008;60(5):249-55. doi: 10.1159/000151762. Epub 2008 Sep 2.

Emotions in vowel segments of continuous speech: analysis of the glottal flow using the normalised amplitude quotient.

Phonetica. 2006;63(1):26-46. doi: 10.1159/000091405. Epub 2006 Mar 2.

Acoustic and EGG analyses of emotional utterances.

Logoped Phoniatr Vocol. 2013 Apr;38(1):11-8. doi: 10.3109/14015439.2012.679966. Epub 2012 May 15.

Emotions in [a]: a perceptual and acoustic study.

Logoped Phoniatr Vocol. 2006;31(1):43-8. doi: 10.1080/14015430500293926.

Formation of the actor's/speaker's formant: a study applying spectrum analysis and computer modeling.

J Voice. 2011 Mar;25(2):150-8. doi: 10.1016/j.jvoice.2009.10.002. Epub 2010 Apr 24.

Flow Glottogram Characteristics and Perceived Degree of Phonatory Pressedness.

J Voice. 2016 May;30(3):287-92. doi: 10.1016/j.jvoice.2015.03.014. Epub 2015 May 20.

Speaking fundamental frequency and vowel formant frequencies: effects on perception of gender.

J Voice. 2013 Sep;27(5):556-66. doi: 10.1016/j.jvoice.2012.11.008. Epub 2013 Feb 13.

Contributions of the glottal source and vocal tract cues to emotional vowel perception in the valence-arousal space.

J Acoust Soc Am. 2018 Aug;144(2):908. doi: 10.1121/1.5051323.

The speaker's formant.

J Voice. 2006 Dec;20(4):555-78. doi: 10.1016/j.jvoice.2005.07.001. Epub 2005 Dec 1.

Cited by

Male-female specific changes in voice parameters under varying room acoustics.

Proc Meet Acoust. 2024 Nov 18;55(1). doi: 10.1121/2.0001979. Epub 2024 Dec 11.

Investigating the moments of "aha" and "hmm" through acoustic analysis of voice and speech in pre-service physics teacher education-A novel method for identifying significant learning moments.

PLoS One. 2025 Jan 24;20(1):e0314344. doi: 10.1371/journal.pone.0314344. eCollection 2025.

Sensitivity of Acoustic Voice Quality Measures in Simulated Reverberation Conditions.

Bioengineering (Basel). 2024 Dec 11;11(12):1253. doi: 10.3390/bioengineering11121253.

Utilizing vocalizations to gain insight into the affective states of non-human mammals.

Front Vet Sci. 2024 Feb 16;11:1366933. doi: 10.3389/fvets.2024.1366933. eCollection 2024.

The Sound of Emotional Prosody: Nearly 3 Decades of Research and Future Directions.

Perspect Psychol Sci. 2024 Jan 17:17456916231217722. doi: 10.1177/17456916231217722.

Acoustic perception and emotion evocation by rock art soundscapes of Altai (Russia).

Front Psychol. 2023 Sep 19;14:1188567. doi: 10.3389/fpsyg.2023.1188567. eCollection 2023.

Cortical haemodynamic responses predict individual ability to recognise vocal emotions with uninformative pitch cues but do not distinguish different emotions.

Hum Brain Mapp. 2023 Jun 15;44(9):3684-3705. doi: 10.1002/hbm.26305. Epub 2023 May 10.

Effects of Data Augmentations on Speech Emotion Recognition.

Sensors (Basel). 2022 Aug 9;22(16):5941. doi: 10.3390/s22165941.

A Moan of Pleasure Should Be Breathy: The Effect of Voice Quality on the Meaning of Human Nonverbal Vocalizations.

Phonetica. 2020;77(5):327-349. doi: 10.1159/000504855. Epub 2020 Jan 21.

Good vibrations: A review of vocal expressions of positive emotions.

Psychon Bull Rev. 2020 Apr;27(2):237-265. doi: 10.3758/s13423-019-01701-x.
