VA RR&D National Center for Rehabilitative Auditory Research (NCRAR), Portland VA Medical Center, 3710 SW US Veterans Hospital Road, Portland, OR 97207, USA.
J Assoc Res Otolaryngol. 2013 Feb;14(1):125-37. doi: 10.1007/s10162-012-0352-1. Epub 2012 Sep 25.
Vowel identification is largely dependent on listeners' access to the frequency of two or three peaks in the amplitude spectrum. Earlier work has demonstrated that, whereas normal-hearing listeners can identify harmonic complexes with vowel-like spectral shapes even with very little amplitude contrast between "formant" components and remaining harmonic components, listeners with hearing loss require greater amplitude differences. This is likely the result of the poor frequency resolution that often accompanies hearing loss. Here, we describe an additional acoustic dimension for emphasizing formant versus non-formant harmonics that may supplement amplitude contrast information. The purpose of this study was to determine whether listeners were able to identify "vowel-like" sounds using temporal (component phase) contrast, which may be less affected by cochlear loss than spectral cues, and whether overall identification improves when congruent temporal and spectral information are provided together. Five normal-hearing and five hearing-impaired listeners identified three vowels over many presentations. Harmonics representing formant peaks were varied in amplitude, phase, or a combination of both. In addition to requiring less amplitude contrast, normal-hearing listeners could accurately identify the sounds with less phase contrast than required by people with hearing loss. However, both normal-hearing and hearing-impaired groups demonstrated the ability to identify vowel-like sounds based solely on component phase shifts, with no amplitude contrast information, and they also showed improved performance when congruent phase and amplitude cues were combined. For nearly all listeners, the combination of spectral and temporal information improved identification in comparison to either dimension alone.
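The stimulus manipulation described above (marking "formant" harmonics by amplitude contrast, phase contrast, or both) can be illustrated with a minimal sketch. The abstract gives no stimulus parameters, so everything here is an assumption chosen for illustration: the fundamental frequency, number of harmonics, formant harmonic numbers, sample rate, duration, and the function names `synth_harmonic_complex` and `harmonic_amplitude` are hypothetical and do not reproduce the authors' stimuli.

```python
import cmath
import math


def synth_harmonic_complex(f0=100.0, n_harmonics=30, formant_harmonics=(5, 12, 25),
                           amp_contrast_db=0.0, phase_shift_rad=0.0,
                           fs=8000, dur=0.1):
    """Sum of equal-amplitude cosine harmonics of f0.

    Harmonics listed in formant_harmonics are raised by amp_contrast_db
    and/or shifted by phase_shift_rad relative to the other harmonics.
    All default values are illustrative assumptions, not the study's.
    """
    formants = set(formant_harmonics)
    gain = 10.0 ** (amp_contrast_db / 20.0)  # dB contrast to linear gain
    n_samples = int(round(fs * dur))
    signal = []
    for n in range(n_samples):
        t = n / fs
        sample = 0.0
        for k in range(1, n_harmonics + 1):
            is_formant = k in formants
            amp = gain if is_formant else 1.0
            phase = phase_shift_rad if is_formant else 0.0
            sample += amp * math.cos(2.0 * math.pi * k * f0 * t + phase)
        signal.append(sample)
    return signal


def harmonic_amplitude(signal, freq, fs):
    """Amplitude of one sinusoidal component via a single-frequency DFT.

    Exact only when the signal holds a whole number of periods of freq.
    """
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / n


# Phase contrast only: formant harmonics shifted by 90 degrees, no level change.
phase_only = synth_harmonic_complex(phase_shift_rad=math.pi / 2)
# Amplitude contrast only: formant harmonics raised by 6 dB, no phase change.
amp_only = synth_harmonic_complex(amp_contrast_db=6.0)
# Congruent cues: both manipulations applied to the same harmonics.
combined = synth_harmonic_complex(amp_contrast_db=6.0, phase_shift_rad=math.pi / 2)
```

In the phase-only condition, `harmonic_amplitude` returns the same value for formant and non-formant harmonics, so the amplitude spectrum stays flat while the waveform itself changes. That is the sense in which component phase is a temporal cue rather than a spectral one.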