Kingston John, Kawahara Shigeto, Chambless Della, Key Michael, Mash Daniel, Watsky Sarah
Linguistics Department, University of Massachusetts, 150 Hicks Way, 226 South College, Amherst, MA, 01003-9274, USA
Atten Percept Psychophys. 2014 Jul;76(5):1437-64. doi: 10.3758/s13414-013-0593-z.
Three experiments are reported that collectively show that listeners perceive speech sounds as contrasting auditorily with neighboring sounds. Experiment 1 replicates the well-established finding that listeners categorize more of a [d-g] continuum as [g] after [l] than after [r]. Experiments 2 and 3 show that listeners discriminate stimuli in which the energy concentrations differ in frequency between the spectra of neighboring sounds better than those in which they do not differ. In Experiment 2, [alga-arda] pairs, in which the energy concentrations in the liquid-stop sequences are H(igh) L(ow)-LH, were more discriminable than [alda-arga] pairs, in which they are HH-LL. In Experiment 3, [da] and [ga] syllables were more easily discriminated when they were preceded by lower and higher pure tones, respectively (that is, tones that differed from the stops' higher and lower F3 onset frequencies), than when they were preceded by H and L pure tones with similar frequencies. These discrimination results show that contrast with the target's context exaggerates its perceived value when energy concentrations differ in frequency between the target's spectrum and its context's spectrum. Because contrast with its context does more than merely shift the criterion for categorizing the target, it cannot be produced by neural adaptation. The finding that nonspeech contexts exaggerate the perceived values of speech targets also rules out compensation for coarticulation by showing that their values depend on the proximal auditory qualities evoked by the stimuli's acoustic properties, rather than the distal articulatory gestures.