Kreiman Jody, Gerratt Bruce R, Antoñanzas-Barroso Norma
Division of Head and Neck Surgery, University of California, Los Angeles, School of Medicine, 31-24 Rehab Center, 1000 Veteran Avenue, Los Angeles, CA 90095-1794, USA.
J Speech Lang Hear Res. 2007 Jun;50(3):595-610. doi: 10.1044/1092-4388(2007/042).
Many researchers have studied the acoustics, physiology, and perceptual characteristics of the voice source, but despite significant attention, it remains unclear which aspects of the source should be quantified and how measurements should be made. In this study, the authors examined the relationships among a number of existing measures of the glottal source spectrum, along with the association of these measures with overall spectral shapes and with glottal pulse shapes, to determine which measures of the source best capture information about the shapes of glottal pulses and glottal source spectra.
Seventy-eight different measures of source spectral shapes were made on the voices of 70 speakers. Principal components analysis was applied to measurement data, and the resulting factors were compared with factors similarly derived from oral speech spectra and glottal pulses.
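As a rough illustration of the analysis step described above, the following minimal sketch applies principal components analysis to a 70 x 78 matrix (speakers by measures). It is not the authors' procedure: the data are synthetic, the function name `pca` and the choice of three latent factors are hypothetical, and the sketch uses plain unrotated PCA on standardized columns, whereas the abstract does not state whether any rotation or other preprocessing was used.

```python
import numpy as np

def pca(data, n_components):
    """Unrotated PCA via SVD of the standardized data matrix.

    Rows are observations (speakers), columns are variables (measures).
    Returns component scores, loadings, and explained-variance ratios.
    """
    centered = data - data.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0              # guard against constant columns
    z = centered / std               # standardize each measure
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # share of total variance per component
    loadings = vt[:n_components].T   # (n_measures, n_components)
    scores = z @ loadings            # (n_speakers, n_components)
    return scores, loadings, explained[:n_components]

# Synthetic stand-in for the measurement data (assumption: 3 latent
# factors drive 78 partly redundant measures across 70 speakers).
rng = np.random.default_rng(0)
n_speakers, n_measures = 70, 78
latent = rng.normal(size=(n_speakers, 3))
mixing = rng.normal(size=(3, n_measures))
data = latent @ mixing + 0.1 * rng.normal(size=(n_speakers, n_measures))

scores, loadings, explained = pca(data, n_components=3)
print(scores.shape, loadings.shape, explained.round(3))
```

When many measures are redundant, as in this synthetic case, a few components account for most of the variance; inspecting which measures load on which component is one way to see duplication among them.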
Results revealed high levels of duplication and overlap among existing measures of source spectral slope. Further, existing measures were not well aligned with patterns of spectral variability. In particular, existing spectral measures do not appear to model the higher frequency parts of the source spectrum adequately.
The failure of existing measures to adequately quantify spectral variability may explain why studies examining the perceptual importance of spectral slope have not produced consistent results. Because variability in the speech signal is often perceptually salient, these results suggest that most existing measures of source spectral slope are unlikely to be good predictors of voice quality.