School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan.
Institute of Acoustics, Chinese Academy of Sciences, 21 North 4th Ring Road, Haidian District, Beijing 100190, People's Republic of China.
J Acoust Soc Am. 2018 Aug;144(2):908. doi: 10.1121/1.5051323.
Motivated by the source-filter model of speech production, analysis of emotional speech based on the inverse-filtering method has been extensively conducted. However, the relative contribution of glottal source and vocal tract cues to the perception of emotions in speech remains unclear, especially after removing the effects of the known dominant factors (e.g., F0, intensity, and duration). In the present study, the glottal source and vocal tract parameters were estimated simultaneously, modified in a controlled way, and then used to resynthesize emotional Japanese vowels by applying a recently developed analysis-by-synthesis method. The resynthesized emotional vowels were presented to native Japanese listeners with normal hearing, who rated the perceived emotions along the valence and arousal dimensions. Results showed that glottal source information played a dominant role in the perception of emotions in vowels, while vocal tract information contributed to valence and arousal perception after the effects of F0, intensity, and duration cues were neutralized.