Christian Brodbeck, Jonathan Z. Simon
Department of Psychological Sciences, University of Connecticut, Storrs, CT, United States.
Institute for Systems Research, University of Maryland, College Park, College Park, MD, United States.
Front Neurosci. 2022 Aug 8;16:828546. doi: 10.3389/fnins.2022.828546. eCollection 2022.
Voice pitch carries linguistic and non-linguistic information. Previous studies have described cortical tracking of voice pitch in clean speech, with responses reflecting both pitch strength and pitch value. However, pitch is also a powerful cue for auditory stream segregation, especially when competing streams differ in fundamental frequency, as is the case when multiple speakers talk simultaneously. We therefore investigated how cortical tracking of speech pitch is affected by the presence of a second, task-irrelevant speaker. We analyzed human magnetoencephalography (MEG) responses to continuous narrative speech, presented either as a single talker in a quiet background or as a two-talker mixture of a male and a female speaker. In clean speech, voice pitch was associated with a right-dominant response peaking at a latency of around 100 ms, consistent with previous electroencephalography and electrocorticography results. The response tracked both the presence of pitch and the relative value of the speaker's fundamental frequency. In the two-talker mixture, the pitch of the attended speaker was tracked bilaterally, regardless of whether pitch was simultaneously present in the speech of the irrelevant speaker. Pitch tracking for the irrelevant speaker was reduced: only the right hemisphere still significantly tracked the pitch of the unattended speaker, and only during intervals in which no pitch was present in the attended talker's speech. Taken together, these results suggest that pitch-based segregation of multiple speakers, at least as measured by macroscopic cortical tracking, is not entirely automatic but depends strongly on selective attention.