Microscopic prediction of speech recognition for listeners with normal hearing in noise using an auditory model.

Affiliation

Medizinische Physik, Universität Oldenburg, D-26111 Oldenburg, Germany.

Publication information

J Acoust Soc Am. 2009 Nov;126(5):2635-48. doi: 10.1121/1.3224721.

PMID: 19894841

Abstract

This study compares the phoneme recognition performance of a microscopic model of speech recognition in speech-shaped noise with that of normal-hearing listeners. For this model, "microscopic" has two meanings. First, the speech recognition rate is predicted on a phoneme-by-phoneme basis. Second, the signal waveforms to be recognized are processed by mimicking elementary parts of human auditory processing. The model is based on an approach by Holube and Kollmeier [J. Acoust. Soc. Am. 100, 1703-1716 (1996)] and consists of a psychoacoustically and physiologically motivated preprocessing stage and a simple dynamic-time-warp speech recognizer. The model is evaluated with nonsense speech presented in a closed-set paradigm. Averaged phoneme recognition rates, phoneme-specific recognition rates, and phoneme confusions are analyzed. The influence of different perceptual distance measures and of the model's a priori knowledge is investigated. The results show that human performance can be predicted by this model when it uses an optimal detector, i.e., identical speech waveforms for both training the recognizer and testing. The best model performance is obtained with distance measures that focus mainly on small perceptual distances and neglect outliers.
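
The abstract names a dynamic-time-warp recognizer but gives no implementation details, so the following is only a minimal sketch of the general technique, not the authors' code. It assumes each utterance has already been reduced to a sequence of feature frames (in the paper these would come from the auditory preprocessing, which is not reproduced here), and it uses a plain Euclidean frame distance as a stand-in for the perceptual distance measures the study compares. The names `frame_distance`, `dtw_distance`, and `recognize` are hypothetical.

```python
from math import inf


def frame_distance(a, b):
    """Euclidean distance between two feature frames.

    Assumption: a stand-in for the perceptual distance measures in the study.
    """
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dtw_distance(template, test, local=frame_distance):
    """Accumulated distance of the best time alignment between two sequences."""
    n, m = len(template), len(test)
    # cost[i][j]: minimal accumulated distance aligning template[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local(template[i - 1], test[j - 1])
            # Allowed steps: advance the template, advance the test, or both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(test, templates, local=frame_distance):
    """Closed-set decision: return the label of the nearest template."""
    return min(templates, key=lambda label: dtw_distance(templates[label], test, local))


if __name__ == "__main__":
    # Toy one-dimensional "features"; real input would be auditory-model output.
    templates = {
        "a": [[0.0], [1.0], [2.0]],
        "b": [[5.0], [5.0], [5.0]],
    }
    stretched_a = [[0.0], [0.0], [1.0], [2.0]]
    print(recognize(stretched_a, templates))
```

Passing a different function as `local` is one way to experiment with alternative distance measures, for example one that saturates at large frame differences so that outliers carry less weight.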
