The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction.

Authors

Chabot-Leclerc Alexandre, Jørgensen Søren, Dau Torsten

Affiliation

Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.

Publication

J Acoust Soc Am. 2014 Jun;135(6):3502-12. doi: 10.1121/1.4873517.

DOI: 10.1121/1.4873517
PMID: 24907813
Abstract

Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated by comparing predictions from models based on the signal-to-noise envelope power ratio, SNRenv, and the modulation transfer function, MTF. The models were evaluated in conditions of noisy speech (1) subjected to reverberation, (2) distorted by phase jitter, or (3) processed by noise reduction via spectral subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined with a measure of across (audio) frequency variability at the output of the auditory preprocessing. A complex spectro-temporal modulation filterbank might therefore not be required for speech intelligibility prediction.
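
As a rough illustration of the two decision metrics compared in the abstract, the Python sketch below computes a toy SNRenv and a toy MTF-based transmission index from temporal envelopes. It is not the authors' implementation. It assumes the commonly cited form SNRenv = (P_env,S+N - P_env,N) / P_env,N, with envelope power normalized by the squared envelope mean, and an STI-style mapping from a modulation reduction factor m to an apparent SNR, 10·log10(m / (1 - m)), clipped to ±15 dB. The function names, the `floor` parameter, and the sinusoidal test envelopes are hypothetical, and the auditory and modulation filterbanks are omitted entirely.

```python
import math


def sinusoidal_envelope(depth, cycles=4, n=4000):
    """Toy envelope: unit mean plus a sinusoidal modulation of the given depth."""
    return [1.0 + depth * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


def envelope_power(envelope):
    """AC envelope power normalized by the squared mean (DC) of the envelope."""
    mean = sum(envelope) / len(envelope)
    ac_power = sum((x - mean) ** 2 for x in envelope) / len(envelope)
    return ac_power / mean ** 2


def snr_env(env_noisy_speech, env_noise, floor=1e-3):
    """Toy SNRenv: estimated speech envelope power over noise envelope power.

    The speech envelope power is estimated as the noisy-speech envelope power
    minus the noise-alone envelope power. `floor` is a hypothetical lower
    limit that keeps the ratio finite and positive.
    """
    p_mix = envelope_power(env_noisy_speech)
    p_noise = max(envelope_power(env_noise), floor)
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / p_noise


def transmission_index(m, limit_db=15.0):
    """STI-style index in [0, 1] from a modulation reduction factor m."""
    if m <= 0.0:
        snr_db = -limit_db
    elif m >= 1.0:
        snr_db = limit_db
    else:
        snr_db = 10.0 * math.log10(m / (1.0 - m))
        snr_db = max(-limit_db, min(limit_db, snr_db))
    return (snr_db + limit_db) / (2.0 * limit_db)


if __name__ == "__main__":
    env_mix = sinusoidal_envelope(0.5)    # stands in for noisy speech
    env_noise = sinusoidal_envelope(0.1)  # stands in for noise alone
    print("SNRenv (linear):", snr_env(env_mix, env_noise))
    print("Transmission index for m = 0.5:", transmission_index(0.5))
```

In this toy setup the SNRenv metric depends on the envelope power of the noise itself, whereas the MTF-based index only sees how much the modulation depth is reduced. That difference is one way to read the abstract's suggestion that SNRenv may generalize better to conditions such as spectral subtraction.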

Similar articles

1. The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction.
   J Acoust Soc Am. 2014 Jun;135(6):3502-12. doi: 10.1121/1.4873517.
2. Modelling speech intelligibility in adverse conditions.
   Adv Exp Med Biol. 2013;787:343-51. doi: 10.1007/978-1-4614-1590-9_38.
3. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing.
   J Acoust Soc Am. 2011 Sep;130(3):1475-87. doi: 10.1121/1.3621502.
4. Effects of manipulating the signal-to-noise envelope power ratio on speech intelligibility.
   J Acoust Soc Am. 2015 Mar;137(3):1401-10. doi: 10.1121/1.4908240.
5. A multi-resolution envelope-power based model for speech intelligibility.
   J Acoust Soc Am. 2013 Jul;134(1):436-46. doi: 10.1121/1.4807563.
6. Speech Intelligibility Prediction using Spectro-Temporal Modulation Analysis.
   IEEE/ACM Trans Audio Speech Lang Process. 2021;29:210-225. doi: 10.1109/taslp.2020.3039929. Epub 2020 Nov 24.
7. Speech intelligibility prediction based on modulation frequency-selective processing.
   Hear Res. 2022 Dec;426:108610. doi: 10.1016/j.heares.2022.108610. Epub 2022 Sep 13.
8. Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain.
   J Acoust Soc Am. 2016 Oct;140(4):2670. doi: 10.1121/1.4964505.
9. Prediction of the influence of reverberation on binaural speech intelligibility in noise and in quiet.
   J Acoust Soc Am. 2011 Nov;130(5):2999-3012. doi: 10.1121/1.3641368.
10. Spectro-temporal modulation energy based mask for robust speaker identification.
    J Acoust Soc Am. 2012 May;131(5):EL368-74. doi: 10.1121/1.3697534.

Cited by

1. Relating Suprathreshold Auditory Processing Abilities to Speech Understanding in Competition.
   Brain Sci. 2022 May 27;12(6):695. doi: 10.3390/brainsci12060695.
2. Mechanisms of Spectrotemporal Modulation Detection for Normal- and Hearing-Impaired Listeners.
   Trends Hear. 2021 Jan-Dec;25:2331216520978029. doi: 10.1177/2331216520978029.
3. Effects of Expanding Envelope Fluctuations on Consonant Perception in Hearing-Impaired Listeners.
   Trends Hear. 2018 Jan-Dec;22:2331216518775293. doi: 10.1177/2331216518775293.
4. Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality.
   J Acoust Soc Am. 2015 Oct;138(4):2470-82. doi: 10.1121/1.4931899.