Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing.

Affiliation

Centre for Applied Hearing Research, Department of Electrical Engineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.

Publication information

J Acoust Soc Am. 2011 Sep;130(3):1475-87. doi: 10.1121/1.3621502.

DOI:10.1121/1.3621502
PMID:21895088
Abstract

A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a similar structure as the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNR(env), at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation frequency selective process provides a key measure of speech intelligibility.
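
The core quantity in the abstract, the speech-to-noise envelope power ratio at the output of a modulation filterbank, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes separate access to the noisy speech and the noise alone, uses a Hilbert envelope, approximates the modulation filterbank with rectangular octave-wide bands in the envelope spectrum, and estimates the speech envelope power as the noisy-speech envelope power minus the noise envelope power. The function names, the band centers, and the `floor` constant are hypothetical choices made for illustration. The peripheral (audio-frequency) filtering and the ideal-observer stage that maps SNR(env) to intelligibility are omitted.

```python
import numpy as np


def hilbert_envelope(x):
    """Return the Hilbert envelope of x using an FFT-based analytic signal."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def envelope_power(env, fs, f_lo, f_hi):
    """AC envelope power in [f_lo, f_hi) Hz, normalized by the squared mean (DC)."""
    n = len(env)
    spectrum = np.fft.rfft(env) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs < f_hi)
    # Factor 2 folds the negative-frequency half into the one-sided sum.
    ac_power = 2.0 * np.sum(np.abs(spectrum[band]) ** 2)
    return ac_power / np.mean(env) ** 2


def snr_env_per_band(noisy_speech, noise, fs, centers=(1, 2, 4, 8, 16, 32, 64)):
    """SNR(env) per modulation band (assumed octave-wide rectangular bands)."""
    floor = 1e-12  # hypothetical guard against division by zero
    env_mix = hilbert_envelope(noisy_speech)
    env_noise = hilbert_envelope(noise)
    ratios = []
    for fc in centers:
        f_lo, f_hi = fc / np.sqrt(2.0), fc * np.sqrt(2.0)
        p_mix = envelope_power(env_mix, fs, f_lo, f_hi)
        p_noise = envelope_power(env_noise, fs, f_lo, f_hi)
        # Speech envelope power is estimated as mixture minus noise, clipped at 0.
        p_speech = max(p_mix - p_noise, 0.0)
        ratios.append(p_speech / max(p_noise, floor))
    return np.array(ratios)
```

Under these assumptions, a band in which the noise envelope power approaches or exceeds the mixture's envelope power yields an SNR(env) near zero, which is the situation the abstract describes for spectral subtraction.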


Similar articles

1. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing.
   J Acoust Soc Am. 2011 Sep;130(3):1475-87. doi: 10.1121/1.3621502.
2. A multi-resolution envelope-power based model for speech intelligibility.
   J Acoust Soc Am. 2013 Jul;134(1):436-46. doi: 10.1121/1.4807563.
3. Perceptual effects of noise reduction by time-frequency masking of noisy speech.
   J Acoust Soc Am. 2012 Oct;132(4):2690-9. doi: 10.1121/1.4747006.
4. Prediction of the influence of reverberation on binaural speech intelligibility in noise and in quiet.
   J Acoust Soc Am. 2011 Nov;130(5):2999-3012. doi: 10.1121/1.3641368.
5. The role of short-time intensity and envelope power for speech intelligibility and psychoacoustic masking.
   J Acoust Soc Am. 2017 Aug;142(2):1098. doi: 10.1121/1.4999059.
6. Intelligibility of reverberant noisy speech with ideal binary masking.
   J Acoust Soc Am. 2011 Oct;130(4):2153-61. doi: 10.1121/1.3631668.
7. The effect of noise envelope modulation on quality judgments of noisy speech.
   J Acoust Soc Am. 2012 Oct;132(4):EL277-83. doi: 10.1121/1.4748343.
8. Sentence perception in listening conditions having similar speech intelligibility indices.
   Int J Audiol. 2011 Jan;50(1):34-40. doi: 10.3109/14992027.2010.521198. Epub 2010 Nov 4.
9. The potential of onset enhancement for increased speech intelligibility in auditory prostheses.
   J Acoust Soc Am. 2012 Oct;132(4):2569-81. doi: 10.1121/1.4748965.
10. Comparison of fluctuating maskers for speech recognition tests.
    Int J Audiol. 2011 Jan;50(1):2-13. doi: 10.3109/14992027.2010.505582. Epub 2010 Nov 23.

Cited by

1. FrAMBI: A Software Framework for Auditory Modeling Based on Bayesian Inference.
   Neuroinformatics. 2025 Feb 10;23(2):20. doi: 10.1007/s12021-024-09702-5.
2. Neurometric amplitude modulation detection in the inferior colliculus of Young and Aged rats.
   Hear Res. 2024 Jun;447:109028. doi: 10.1016/j.heares.2024.109028. Epub 2024 May 3.
3. Original speech and its echo are segregated and separately processed in the human brain.
   PLoS Biol. 2024 Feb 15;22(2):e3002498. doi: 10.1371/journal.pbio.3002498. eCollection 2024 Feb.
4. Neural Fluctuation Contrast as a Code for Complex Sounds: The Role and Control of Peripheral Nonlinearities.
   Hear Res. 2024 Mar 1;443:108966. doi: 10.1016/j.heares.2024.108966. Epub 2024 Feb 1.
5. Sentence recognition with modulation-filtered speech segments for younger and older adults: Effects of hearing impairment and cognition.
   J Acoust Soc Am. 2023 Nov 1;154(5):3328-3343. doi: 10.1121/10.0022445.
6. Adaptive mechanisms facilitate robust performance in noise and in reverberation in an auditory categorization model.
   Commun Biol. 2023 May 2;6(1):456. doi: 10.1038/s42003-023-04816-z.
7. Computational modeling of the human compound action potential.
   J Acoust Soc Am. 2023 Apr 1;153(4):2376. doi: 10.1121/10.0017863.
8. Microscopic and Blind Prediction of Speech Intelligibility: Theory and Practice.
   IEEE/ACM Trans Audio Speech Lang Process. 2022;30:2141-2155. doi: 10.1109/taslp.2022.3184888. Epub 2022 Jun 30.
9. Extending the Hearing-Aid Speech Perception Index (HASPI): Keywords, sentences, and context.
   J Acoust Soc Am. 2023 Mar;153(3):1662. doi: 10.1121/10.0017546.
10. Quantitative models of auditory cortical processing.
    Hear Res. 2023 Mar 1;429:108697. doi: 10.1016/j.heares.2023.108697. Epub 2023 Jan 14.