Vocal quality factors: analysis, synthesis, and perception.

Author information

Childers D G, Lee C K

Affiliation

Department of Electrical Engineering, University of Florida, Gainesville 32611-2024.

Publication information

J Acoust Soc Am. 1991 Nov;90(5):2394-410. doi: 10.1121/1.402044.

DOI:10.1121/1.402044

PMID:1837797

Abstract

The purpose of this study was to examine several factors of vocal quality that might be affected by changes in vocal fold vibratory patterns. Four voice types were examined: modal, vocal fry, falsetto, and breathy. Three categories of analysis techniques were developed to extract source-related features from speech and electroglottographic (EGG) signals. Four factors were found to be important for characterizing the glottal excitations for the four voice types: the glottal pulse width, the glottal pulse skewness, the abruptness of glottal closure, and the turbulent noise component. The significance of these factors for voice synthesis was studied and a new voice source model that accounted for certain physiological aspects of vocal fold motion was developed and tested using speech synthesis. Perceptual listening tests were conducted to evaluate the auditory effects of the source model parameters upon synthesized speech. The effects of the spectral slope of the source excitation, the shape of the glottal excitation pulse, and the characteristics of the turbulent noise source were considered. Applications for these research results include synthesis of natural sounding speech, synthesis and modeling of vocal disorders, and the development of speaker independent (or adaptive) speech recognition systems.
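
To make three of the four factors named in the abstract concrete (pulse width, pulse skewness, and the turbulent noise component), here is a minimal sketch of one glottal flow period. It is not the source model developed in the paper: it uses a generic Rosenberg-style trigonometric pulse as a stand-in, and the function names, the parameters `open_quotient`, `speed_quotient`, and `noise_level`, and all numeric values are assumptions chosen only for illustration. Abruptness of glottal closure is not modeled separately here.

```python
import math
import random


def glottal_pulse(n_period, open_quotient, speed_quotient):
    """One period of a Rosenberg-style trigonometric glottal flow pulse.

    Illustrative stand-in only, not the model from the paper.
    open_quotient: fraction of the period the glottis is open (pulse width).
    speed_quotient: opening duration / closing duration (pulse skewness).
    """
    n_open = max(2, round(open_quotient * n_period))
    n_rise = max(1, round(n_open * speed_quotient / (1.0 + speed_quotient)))
    n_fall = max(1, n_open - n_rise)
    pulse = []
    for n in range(n_period):
        if n < n_rise:
            # Opening phase: raised-cosine rise from 0 towards 1.
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_rise)))
        elif n < n_rise + n_fall:
            # Closing phase: quarter-cosine fall from 1 towards 0.
            pulse.append(math.cos(math.pi * (n - n_rise) / (2.0 * n_fall)))
        else:
            # Closed phase: no flow.
            pulse.append(0.0)
    return pulse


def add_turbulent_noise(pulse, noise_level, seed=0):
    """Add Gaussian noise while the glottis is open (a crude assumption)."""
    rng = random.Random(seed)
    return [g + noise_level * rng.gauss(0.0, 1.0) if g > 0.0 else g
            for g in pulse]


# Hypothetical settings: a narrower, clean pulse and a wider, noisier one.
narrow = glottal_pulse(100, open_quotient=0.6, speed_quotient=2.0)
wide_noisy = add_turbulent_noise(
    glottal_pulse(100, open_quotient=0.8, speed_quotient=2.0),
    noise_level=0.05,
)
```

Raising `open_quotient` widens the pulse, raising `speed_quotient` shifts the peak later in the open phase (more skew), and `noise_level` scales the added noise. A more careful sketch would shape the noise spectrally and control how sharply the flow returns to zero at closure.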