Matthew B. Winn, Jong Ho Won, Il Joon Moon
(1) Department of Speech & Hearing Sciences, University of Washington, Seattle, Washington, USA; (2) Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, Tennessee, USA; (3) Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, Washington, USA; and (4) Department of Otorhinolaryngology-Head and Neck Surgery, Samsung Medical Center, Sungkyunkwan University, School of Medicine, Seoul, Korea.
Ear Hear. 2016 Nov/Dec;37(6):e377-e390. doi: 10.1097/AUD.0000000000000328.
This study was conducted to measure auditory perception by cochlear implant users in the spectral and temporal domains, using tests of either categorization (using speech-based cues) or discrimination (using conventional psychoacoustic tests). The authors hypothesized that traditional nonlinguistic tests assessing spectral and temporal auditory resolution would correspond to speech-based measures assessing specific aspects of phonetic categorization assumed to depend on spectral and temporal auditory resolution. The authors further hypothesized that speech-based categorization performance would ultimately be a superior predictor of speech recognition performance, because of the fundamental nature of speech recognition as categorization.
Nineteen cochlear implant listeners and 10 listeners with normal hearing participated in a suite of tasks that included spectral ripple discrimination, temporal modulation detection, and syllable categorization, which was split into a spectral cue-based task (targeting the /ba/-/da/ contrast) and a timing cue-based task (targeting the /b/-/p/ and /d/-/t/ contrasts). Speech sounds were manipulated to contain specific spectral or temporal modulations (formant transitions or voice onset time, respectively) that could be categorized. Categorization responses were quantified using logistic regression to assess perceptual sensitivity to acoustic phonetic cues. Word recognition testing was also conducted for cochlear implant listeners.
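As a rough illustration of how logistic regression can quantify sensitivity to a phonetic cue, the sketch below fits a single-predictor logistic function to binary categorization responses along a stimulus continuum. The abstract does not give the model specification, so everything here is an assumption made for illustration: the seven-step centered continuum, the invented response counts, and the names `fit_logistic` and `expand` are hypothetical, and the study's actual analysis may differ (for example, it may have included additional predictors or random effects). A steeper fitted slope indicates that responses change more sharply as the cue changes, which is one way to express perceptual sensitivity to that cue.

```python
import math


def fit_logistic(x, y, n_iter=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    x: cue values (e.g. centered continuum steps); y: 0/1 responses.
    Returns the intercept b0 and slope b1.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # entries of the (negated) Hessian
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:            # ill-conditioned; stop updating
            break
        # Newton step: solve the 2x2 system H * d = g for d.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def expand(steps, counts, n_per_step):
    """Turn per-step response counts into trial-level (x, y) lists."""
    x, y = [], []
    for step, k in zip(steps, counts):
        x += [step] * n_per_step
        y += [1] * k + [0] * (n_per_step - k)
    return x, y


# Hypothetical data: 7 continuum steps (centered), 10 trials per step.
# Counts are the number of "voiceless" (e.g. /p/) responses at each step.
steps = [-3, -2, -1, 0, 1, 2, 3]
sharp_counts = [0, 0, 1, 5, 9, 10, 10]     # invented: crisp category boundary
shallow_counts = [2, 3, 4, 5, 6, 7, 8]     # invented: gradual, less cue use

sharp_b0, sharp_slope = fit_logistic(*expand(steps, sharp_counts, 10))
shallow_b0, shallow_slope = fit_logistic(*expand(steps, shallow_counts, 10))

# Category boundary: the cue value at which P(y=1) = 0.5.
sharp_boundary = -sharp_b0 / sharp_slope
shallow_boundary = -shallow_b0 / shallow_slope

print("sharp listener:   slope =", round(sharp_slope, 3))
print("shallow listener: slope =", round(shallow_slope, 3))
```

In this toy comparison both invented listeners place the category boundary at the middle of the continuum, but the first produces a steeper slope, which is the pattern one would expect from a listener who uses the cue more reliably.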
Cochlear implant users were generally less successful at utilizing both spectral and temporal cues for categorization compared with listeners with normal hearing. For the cochlear implant listener group, spectral ripple discrimination was significantly correlated with the categorization of formant transitions; both were correlated with better word recognition. Temporal modulation detection using 100- and 10-Hz-modulated noise was not correlated either with the cochlear implant subjects' categorization of voice onset time or with word recognition. Word recognition was correlated more closely with categorization of the controlled speech cues than with performance on the psychophysical discrimination tasks.
When evaluating people with cochlear implants, controlled speech-based stimuli are feasible to use in tests of auditory cue categorization, to complement traditional measures of auditory discrimination. Stimuli based on specific speech cues correspond to counterpart nonlinguistic measures of discrimination, but potentially show better correspondence with speech perception more generally. The ubiquity of the spectral (formant transition) and temporal (voice onset time) stimulus dimensions across languages highlights the potential to use this testing approach even in cases where English is not the native language.