Quality and Usability Lab, Technische Universität Berlin, D-10587 Berlin, Germany.
J Neural Eng. 2019 Jun;16(3):036009. doi: 10.1088/1741-2552/aaf122. Epub 2018 Nov 15.
By means of subjective psychophysical methods, the quality of transmitted speech has been decomposed into three perceptual dimensions named 'discontinuity' (F), 'noisiness' (N) and 'coloration' (C). Previous studies using electroencephalography (EEG) have already reported effects of the perceived intensity of single quality dimensions on electrical brain activity. However, it has not yet been investigated whether the dimensions themselves are dissociable on a neurophysiological level of analysis.
Pursuing this goal in the present study, a high-quality (HQ) recording of a spoken word was degraded on one dimension at a time, resulting in three quality-impaired stimuli (F, N, C) that were, on average, described as equal in perceived degradation intensity. Participants performed a three-stimulus oddball task involving the serial presentation of two stimulus types: (1) HQ or degraded 'standard' stimuli, which established sensory/perceptual quality references, and (2) degraded 'oddball' stimuli, which caused random, infrequent deviations from those references. EEG was employed to examine the neuro-electrical correlates of speech quality perception.
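As an illustration of the kind of stimulus sequence such an oddball task involves, the following is a minimal sketch and not the procedure used in the study. The function name `make_oddball_sequence`, the trial count, the oddball probability, and the minimum number of standards between oddballs are all hypothetical choices made for this example.

```python
import random


def make_oddball_sequence(standard, oddballs, n_trials=200,
                          p_oddball=0.1, min_gap=2, seed=0):
    """Build one block of stimulus labels (all parameters are assumptions).

    Each oddball type occurs round(n_trials * p_oddball) times; every other
    trial is the standard. At least `min_gap` standards precede each oddball,
    so oddballs stay infrequent and never occur back to back.
    """
    rng = random.Random(seed)
    n_each = round(n_trials * p_oddball)
    odd_labels = [label for label in oddballs for _ in range(n_each)]
    rng.shuffle(odd_labels)  # random order of oddball types

    n_odd = len(odd_labels)
    n_std = n_trials - n_odd
    required = n_odd * min_gap  # standards reserved for the mandatory gaps
    if required > n_std:
        raise ValueError("not enough standard trials for the requested gap")

    # Scatter the remaining standards randomly over the slots before each
    # oddball and the final slot after the last oddball.
    slots = [0] * (n_odd + 1)
    for _ in range(n_std - required):
        slots[rng.randrange(n_odd + 1)] += 1

    sequence = []
    for i, label in enumerate(odd_labels):
        sequence.extend([standard] * (slots[i] + min_gap))
        sequence.append(label)
    sequence.extend([standard] * slots[n_odd])
    return sequence


# Hypothetical block: HQ standard with discontinuity (F) and noisiness (N) oddballs.
block = make_oddball_sequence("HQ", ["F", "N"])
```

Changing the `standard` argument to a degraded stimulus (for example `"C"`) gives a block in which the quality reference itself is degraded, which is the other case mentioned above.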
Emphasis was placed on modulations in temporal and morphological characteristics of the P300 component of the event-related brain potential (ERP), whose subcomponents P3a and P3b are commonly linked to attentional orienting and task-relevance categorization, respectively. Electrophysiological data analysis ([Formula: see text]) revealed significant modulations of P300 amplitude and latency by the perceptual dimensions underlying both quality references and oddball stimuli.
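P300 amplitude and latency are often read off an averaged ERP as the largest positive deflection within a post-stimulus search window. The sketch below shows that generic peak-picking idea only; the abstract does not state how the measures were obtained in the study. The function names, the 0.25 to 0.6 s window, and the sampling rate in the usage lines are assumptions for illustration.

```python
def average_epochs(epochs):
    """Average equally long single-trial epochs sample by sample into an ERP."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def peak_in_window(erp, sfreq, t_start, window=(0.25, 0.6)):
    """Return (amplitude, latency_s) of the largest value inside `window`.

    erp     : list of voltages for one channel
    sfreq   : sampling rate in Hz
    t_start : time in seconds of the first sample relative to stimulus onset
    window  : assumed P300 search window in seconds (hypothetical default)
    """
    candidates = []
    for i, value in enumerate(erp):
        t = t_start + i / sfreq
        if window[0] <= t <= window[1]:
            candidates.append((value, t))
    if not candidates:
        raise ValueError("search window lies outside the epoch")
    # Pick the sample with the highest voltage; ties resolve to the earliest.
    return max(candidates, key=lambda pair: pair[0])


# Synthetic usage: two toy epochs with a triangular bump around sample 45.
bump = [max(0.0, 5.0 - 0.5 * abs(i - 45)) for i in range(80)]
erp = average_epochs([bump, [v + 2.0 for v in bump]])
amplitude, latency = peak_in_window(erp, sfreq=100.0, t_start=-0.1)
```

Comparing the returned amplitude and latency across conditions (for example oddball F versus oddball N) is one simple way to look for the kind of dimension-specific modulation described above.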
The present study exemplifies the utility of physiological methods like EEG for dissociating speech degradations based not only on perceived intensity level, but also on their distinctive quality dimension.