Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA.
Department of Otolaryngology/HNS, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA.
J Acoust Soc Am. 2022 Sep;152(3):1639. doi: 10.1121/10.0013993.
The use of spectrally degraded speech signals deprives listeners of acoustic information that is useful for speech perception. Several popular speech corpora, recorded decades ago, have spectral degradations, including limited extended high-frequency (EHF) (>8 kHz) content. Although frequency content above 8 kHz is often assumed to play little or no role in speech perception, recent research suggests that EHF content in speech can have a significant beneficial impact on speech perception under a wide range of natural listening conditions. This paper provides an analysis of the spectral content of popular speech corpora used for speech perception research to highlight the potential shortcomings of using bandlimited speech materials. Two corpora analyzed here, the TIMIT and NU-6, have substantial low-frequency spectral degradation (<500 Hz) in addition to EHF degradation. We provide an overview of the phenomena potentially missed by using bandlimited speech signals, and the factors to consider when selecting stimuli that are sensitive to these effects.
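As a rough illustration of the kind of spectral check described above, the following minimal sketch estimates how much of a signal's energy lies in the EHF band (>8 kHz) or the low-frequency band (<500 Hz) relative to its total energy. It is not the authors' analysis method. The function name `band_energy_db`, the synthetic test tones, and the 44.1 kHz sample rate are assumptions made for illustration only; a real analysis would load recordings from the corpus of interest.

```python
import numpy as np


def band_energy_db(signal, sample_rate, f_lo, f_hi):
    """Energy in the band [f_lo, f_hi) relative to total energy, in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Power spectrum of the real-valued signal.
    power = np.abs(np.fft.rfft(signal)) ** 2
    # Frequency (Hz) associated with each FFT bin.
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    eps = 1e-20  # avoids log10(0) for an empty or silent band
    return 10.0 * np.log10((power[in_band].sum() + eps) / (power.sum() + eps))


# Synthetic stand-ins for recordings (assumed values, not corpus data).
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate  # one second of samples

# "Full-band" signal: a 1 kHz tone plus a weaker 10 kHz (EHF) component.
full_band = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.sin(2 * np.pi * 10000 * t)
# "Band-limited" signal: the same 1 kHz tone with no EHF component.
band_limited = np.sin(2 * np.pi * 1000 * t)

nyquist = sample_rate / 2
ehf_full = band_energy_db(full_band, sample_rate, 8000, nyquist)
ehf_limited = band_energy_db(band_limited, sample_rate, 8000, nyquist)
low_full = band_energy_db(full_band, sample_rate, 0, 500)

print(f"EHF energy, full-band signal:    {ehf_full:.1f} dB re total")
print(f"EHF energy, band-limited signal: {ehf_limited:.1f} dB re total")
print(f"<500 Hz energy, full-band signal: {low_full:.1f} dB re total")
```

A relative EHF level far below that of a known full-band recording of comparable speech would suggest that the material is band-limited. The threshold for "far below" depends on the talker, the recording chain, and the analysis window, so it is left unspecified here.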