Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, USA.
Department of Psychology, Pennsylvania State University, State College, Pennsylvania, USA.
Ear Hear. 2024;45(2):425-440. doi: 10.1097/AUD.0000000000001438. Epub 2023 Oct 26.
The listening demand incurred by speech perception fluctuates in normal conversation. At the acoustic-phonetic level, natural variation in pronunciation acts as a speedbump to accurate lexical selection. Any given utterance may be more or less phonetically ambiguous, a problem that the listener must resolve to choose the correct word. This becomes especially apparent when considering two common speech registers, clear and casual, that have characteristically different levels of phonetic ambiguity. Clear speech prioritizes intelligibility through hyperarticulation, which results in less ambiguity at the phonetic level, while casual speech tends to have a more collapsed acoustic space. We hypothesized that listeners would invest greater cognitive resources while listening to casual speech than to clear speech, to resolve the increased phonetic ambiguity. To this end, we used pupillometry as an online measure of listening effort during perception of clear and casual continuous speech in two background conditions: quiet and noise.
Forty-eight participants performed a probe detection task while listening to spoken, nonsensical sentences (masked and unmasked) while their pupil size was recorded. Pupil size was modeled using growth curve analysis to capture the dynamics of the pupil response as the sentence unfolded.
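As an illustration of the general idea behind growth curve analysis, the sketch below fits orthogonal polynomial time terms to a single pupil trace. It is a minimal example under stated assumptions, not the authors' model: the trace is synthetic, the polynomial degree (3) is arbitrary, the function names `orthogonal_poly` and `fit_growth_curve` are hypothetical, and the fit is ordinary least squares for one trace, whereas growth curve analyses of pupil data are typically fit as mixed-effects models across participants and conditions.

```python
import numpy as np

def orthogonal_poly(time, degree):
    """Return orthonormal polynomial time terms of order 1..degree (hypothetical helper)."""
    t = np.asarray(time, dtype=float)
    # Columns are (t - mean)^0, (t - mean)^1, ..., (t - mean)^degree.
    vander = np.vander(t - t.mean(), degree + 1, increasing=True)
    # QR decomposition orthonormalizes the columns in order.
    q, _ = np.linalg.qr(vander)
    # Drop the first (constant) column; the rest are orthogonal to the intercept.
    return q[:, 1:]

def fit_growth_curve(time, pupil, degree=3):
    """Least-squares fit of intercept plus orthogonal time terms (assumed degree)."""
    terms = orthogonal_poly(time, degree)
    design = np.column_stack([np.ones(len(terms)), terms])
    coefs, *_ = np.linalg.lstsq(design, np.asarray(pupil, dtype=float), rcond=None)
    return coefs, design @ coefs

# Synthetic pupil trace (illustrative only): a slow rise and fall plus noise.
rng = np.random.default_rng(0)
time = np.linspace(0.0, 3.0, 61)  # seconds from sentence onset (assumed)
pupil = 0.2 * np.sin(np.pi * time / 3.0) + rng.normal(0.0, 0.01, time.size)

coefs, fitted = fit_growth_curve(time, pupil, degree=3)
terms = orthogonal_poly(time, 3)
```

Because the time terms are orthogonal to the constant column, the intercept equals the mean of the trace, and the linear, quadratic, and cubic coefficients can be interpreted independently of one another. In this framing, condition effects such as noise or speech register would enter as differences in those coefficients.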
Pupil size during listening was sensitive to the presence of noise and to speech register (clear/casual). Unsurprisingly, listeners had overall larger pupil dilations during speech perception in noise, replicating earlier work. The pupil dilation pattern for clear and casual sentences was considerably more complex. Pupil dilation during clear speech trials was slightly larger than during casual speech trials, across quiet and noisy backgrounds.
We suggest that listener motivation could explain the larger pupil dilations to clearly spoken speech. We propose that, bounded by the context of this task, listeners devoted more resources to perceiving the speech signal with the greatest acoustic/phonetic fidelity. Further, we unexpectedly found systematic differences in pupil dilation preceding the onset of the spoken sentences. Together, these data demonstrate that the pupillary system is not merely reactive but also adaptive: it is sensitive to both task structure and listener motivation, maximizing accurate perception in a limited-resource system.