Choi Jeung-Yoon, Hasegawa-Johnson Mark, Cole Jennifer
Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA.
J Acoust Soc Am. 2005 Oct;118(4):2579-87. doi: 10.1121/1.2010288.
Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for relevance to prosodic boundary detection. The measurements considered here comprise five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Distributions of the measurements and statistical analysis show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus show equal error detection rates around 70% for accent and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements, and, to a lesser degree, pitch measurements, are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.
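As an illustration of the "equal error" operating point mentioned above, the following minimal sketch assumes it refers to the detection rate at the score threshold where the miss rate and the false-alarm rate are (approximately) equal. The function name `equal_error_detection_rate`, the toy scores, and the labels are hypothetical and are not taken from the paper or from the Boston University Radio Speech Corpus.

```python
def equal_error_detection_rate(scores, labels):
    """Return (threshold, detection_rate) at the threshold where the miss
    rate and the false-alarm rate are closest to each other.

    scores: detector outputs, higher means "more likely a boundary/accent".
    labels: 1 for a true boundary/accent, 0 otherwise.
    Hypothetical helper for illustration only.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    best = None  # (gap, threshold, detection_rate)
    for t in sorted(set(scores)):
        # An item is detected when its score is at or above the threshold.
        miss = sum(s < t for s in positives) / len(positives)
        false_alarm = sum(s >= t for s in negatives) / len(negatives)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, t, 1.0 - miss)
    return best[1], best[2]


# Toy data (invented): eight scored syllables, four of them true boundaries.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]
threshold, rate = equal_error_detection_rate(toy_scores, toy_labels)
print(threshold, rate)  # 0.6 0.75
```

On this toy data the miss rate and false-alarm rate are both 25% at a threshold of 0.6, so the detection rate at the equal-error point is 75%.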