Department of Psychology, Royal Holloway, University of London, Egham, UK; Institute of Cognitive Neuroscience, University College London, London, UK.
Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK.
Front Syst Neurosci. 2014 Feb 25;8:18. doi: 10.3389/fnsys.2014.00018. eCollection 2014.
Noise-vocoding is a transformation which, when applied to speech, severely reduces spectral resolution and eliminates periodicity, yielding a stimulus that sounds "like a harsh whisper" (Scott et al., 2000, p. 2401). This process simulates a cochlear implant, in which the activity of many thousands of hair cells in the inner ear is replaced by direct stimulation of the auditory nerve by a small number of tonotopically-arranged electrodes. Although a cochlear implant offers a powerful means of restoring some degree of hearing to profoundly deaf individuals, the outcomes for spoken communication are highly variable (Moore and Shannon, 2009). Some variability may arise from differences in peripheral representation (e.g., the degree of residual nerve survival), but some may reflect differences in higher-order linguistic processing. To explore this possibility, we used noise-vocoding to examine speech recognition and perceptual learning in normal-hearing listeners tested across several levels of the linguistic hierarchy: segments (consonants and vowels), single words, and sentences. Listeners improved significantly on all tasks across two test sessions. In the first session, individual differences analyses revealed two independently varying sources of variability: one lexico-semantic in nature, implicating the recognition of words and sentences, and the other an acoustic-phonetic factor associated with words and segments. However, as a consequence of learning, by the second session there was a more uniform covariance pattern involving all stimulus types. A further analysis of phonetic feature recognition allowed greater insight into learning-related changes in perception and showed that, surprisingly, participants did not make full use of cues that were preserved in the stimuli (e.g., vowel duration).
We discuss these findings in relation to cochlear implantation, and suggest auditory training strategies to maximize speech recognition performance in the absence of typical cues.
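For readers unfamiliar with the technique, the noise-vocoding transformation described above can be sketched as follows: the speech signal is split into a small number of frequency bands, the amplitude envelope of each band is extracted, and each envelope is used to modulate band-limited noise before the channels are recombined. The sketch below is illustrative only; the channel count, band edges, filter orders, and envelope cutoff are assumptions, not the parameters used in the study (which follows Scott et al., 2000).

```python
# Minimal noise-vocoder sketch: analysis filterbank, envelope extraction,
# and modulation of band-limited noise carriers. All parameter values here
# (6 channels, 100-4000 Hz range, 30 Hz envelope cutoff) are illustrative.
import numpy as np
from scipy.signal import butter, sosfilt

def noise_vocode(signal, fs, n_channels=6, f_lo=100.0, f_hi=4000.0, env_cutoff=30.0):
    """Replace the fine structure in each band with amplitude-modulated noise,
    preserving only the slow amplitude envelope per channel."""
    # Log-spaced band edges spanning the analysis range
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(signal))
    # Low-pass filter used to smooth the rectified band signal into an envelope
    env_sos = butter(2, env_cutoff / (fs / 2), btype="low", output="sos")
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo / (fs / 2), hi / (fs / 2)],
                          btype="band", output="sos")
        band = sosfilt(band_sos, signal)               # analysis band
        envelope = sosfilt(env_sos, np.abs(band))      # rectify + smooth
        carrier = sosfilt(band_sos, noise)             # band-limited noise
        out += np.clip(envelope, 0.0, None) * carrier  # modulated channel
    return out

# Usage on a synthetic amplitude-modulated tone standing in for speech
fs = 16000
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 220 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
vocoded = noise_vocode(sig, fs)
```

Because only the per-band envelopes survive, periodicity (the 220 Hz fine structure above) is discarded while the slow amplitude modulation is retained, which is why the result sounds whisper-like; reducing `n_channels` further degrades spectral resolution, approximating fewer implant electrodes.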