The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction.

Authors

Chabot-Leclerc Alexandre, Jørgensen Søren, Dau Torsten

Affiliation

Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.

Publication information

J Acoust Soc Am. 2014 Jun;135(6):3502-12. doi: 10.1121/1.4873517.

DOI: 10.1121/1.4873517

PMID: 24907813

Abstract

Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated by comparing predictions from models based on the signal-to-noise envelope power ratio, SNRenv, and the modulation transfer function, MTF. The models were evaluated in conditions of noisy speech (1) subjected to reverberation, (2) distorted by phase jitter, or (3) processed by noise reduction via spectral subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined with a measure of across (audio) frequency variability at the output of the auditory preprocessing. A complex spectro-temporal modulation filterbank might therefore not be required for speech intelligibility prediction.
