Correlation between audio-visual enhancement of speech in different noise environments and SNR: a combined behavioral and electrophysiological study.

Affiliation

School of Computer Science and Technology, Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin 300072, PR China.

Publication information

Neuroscience. 2013 Sep 5;247:145-51. doi: 10.1016/j.neuroscience.2013.05.007. Epub 2013 May 11.

Abstract

In the present study, we investigated multisensory gain in different noise environments using two measures: a behavioral gain, defined as the difference in speech recognition accuracy between the audio-visual (AV) and auditory-only (A) conditions, and an electrophysiological gain, defined as the difference between the event-related potentials (ERPs) evoked under the AV condition and the sum of the ERPs evoked under the A and visual-only (V) conditions. Videos of a female speaker articulating Chinese monosyllabic words, accompanied by different levels of pink noise, were used as stimuli. The selected signal-to-noise ratios (SNRs) were -16, -12, -8, -4 and 0 dB. Speech recognition accuracy was measured and the evoked ERPs were analyzed under the A, V and AV conditions. The behavioral results showed that the gain in speech recognition accuracy (AV minus A) was largest at the -12 dB SNR. The ERP results showed that, in the 130-200 ms time window over the frontal to central region, the multisensory gain (AV minus the sum of A and V) at the -12 dB SNR was significantly higher than at the other SNRs. The multisensory gains in audio-visual speech recognition across SNRs were not fully consistent with the principle of inverse effectiveness, but conformed to cross-modal stochastic resonance.
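The two gain measures defined in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' analysis pipeline: the function names are hypothetical, the ERP is treated as a plain list of amplitudes for one channel, and the accuracy values are made-up placeholders rather than data from the study.

```python
from typing import Dict, List, Sequence


def behavioral_gain(acc_av: Dict[int, float], acc_a: Dict[int, float]) -> Dict[int, float]:
    """Per-SNR behavioral gain: AV accuracy minus A accuracy."""
    return {snr: acc_av[snr] - acc_a[snr] for snr in acc_av}


def erp_gain(erp_av: Sequence[float], erp_a: Sequence[float],
             erp_v: Sequence[float]) -> List[float]:
    """Sample-by-sample difference wave: AV - (A + V)."""
    return [av - (a + v) for av, a, v in zip(erp_av, erp_a, erp_v)]


def mean_in_window(wave: Sequence[float], times_ms: Sequence[float],
                   start_ms: float, end_ms: float) -> float:
    """Mean amplitude of a wave inside an inclusive time window, e.g. 130-200 ms."""
    vals = [x for x, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return sum(vals) / len(vals)


def snr_of_max_gain(gains: Dict[int, float]) -> int:
    """SNR (in dB) at which the gain is largest."""
    return max(gains, key=gains.get)


# Placeholder accuracies in percent, keyed by SNR in dB.
# These numbers are invented for illustration; they are NOT the study's results.
toy_acc_a = {-16: 10, -12: 20, -8: 50, -4: 80, 0: 95}
toy_acc_av = {-16: 30, -12: 60, -8: 80, -4: 95, 0: 99}

toy_gains = behavioral_gain(toy_acc_av, toy_acc_a)
peak_snr = snr_of_max_gain(toy_gains)
```

With real data, `mean_in_window` would be applied to the difference wave from `erp_gain` at each SNR, and the resulting window means compared across SNRs; the statistical test used for that comparison is not specified in the abstract.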
