Factors influencing intelligibility of ideal binary-masked speech: implications for noise reduction.

Authors

Li Ning, Loizou Philipos C

Affiliation

Department of Electrical Engineering, University of Texas at Dallas, Richardson, Texas 75083-0688, USA.

Publication

J Acoust Soc Am. 2008 Mar;123(3):1673-82. doi: 10.1121/1.2832617.

DOI: 10.1121/1.2832617
PMID: 18345855
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2696360/

Abstract

The application of the ideal binary mask to an auditory mixture has been shown to yield substantial improvements in intelligibility. This mask is commonly applied to the time-frequency (T-F) representation of a mixture signal and eliminates portions of a signal below a signal-to-noise-ratio (SNR) threshold while allowing others to pass through intact. The factors influencing intelligibility of ideal binary-masked speech are not well understood and are examined in the present study. Specifically, the effects of the local SNR threshold, input SNR level, masker type, and errors introduced in estimating the ideal mask are examined. Consistent with previous studies, intelligibility of binary-masked stimuli is quite high even at -10 dB SNR for all maskers tested. Performance was affected the most when the masker-dominated T-F units were wrongly labeled as target-dominated T-F units. Performance plateaued near 100% correct for SNR thresholds ranging from -20 to 5 dB. The existence of the plateau region suggests that it is the pattern of the ideal binary mask that matters the most rather than the local SNR of each T-F unit. This pattern directs the listener's attention to where the target is and enables them to segregate speech effectively in multitalker environments.
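
The masking operation and the two kinds of mask-estimation error described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the STFT front end, the 20 ms frame and 10 ms hop at 16 kHz, the synthetic tone-plus-white-noise signals, the -5 dB default threshold, and the helper names `stft`, `ideal_binary_mask`, and `flip_labels` are all assumptions made for illustration.

```python
import numpy as np

def stft(x, frame_len=320, hop=160):
    """Hann-windowed short-time Fourier transform, shape (freq_bins, frames).
    Frame length and hop are illustrative assumptions, not the paper's values."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T

def ideal_binary_mask(target, masker, threshold_db=-5.0, eps=1e-12):
    """Return 1 for T-F units whose local SNR exceeds threshold_db, else 0.
    'Ideal' because it needs the target and masker signals separately."""
    target_power = np.abs(stft(target)) ** 2
    masker_power = np.abs(stft(masker)) ** 2
    local_snr_db = 10.0 * np.log10((target_power + eps) / (masker_power + eps))
    return (local_snr_db > threshold_db).astype(np.uint8)

def flip_labels(mask, rate, from_value, rng):
    """Simulate estimation errors: flip a random fraction `rate` of the units
    that currently hold `from_value` (0 = masker-dominated, 1 = target-dominated)."""
    noisy = mask.copy()
    candidates = np.flatnonzero(mask == from_value)
    n_flip = int(round(rate * candidates.size))
    chosen = rng.choice(candidates, size=n_flip, replace=False)
    noisy.flat[chosen] = 1 - from_value
    return noisy

# Synthetic stand-ins for speech and masker (assumptions, not real stimuli).
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * t)
masker = rng.standard_normal(fs)

mask = ideal_binary_mask(target, masker, threshold_db=-5.0)
mixture_spec = stft(target + masker)
masked_spec = mask * mixture_spec  # units below the threshold are zeroed

# Masker-dominated units wrongly labeled as target-dominated
# (the error type the abstract reports as most harmful).
false_alarm_mask = flip_labels(mask, rate=0.2, from_value=0, rng=rng)
# Target-dominated units wrongly labeled as masker-dominated.
miss_mask = flip_labels(mask, rate=0.2, from_value=1, rng=rng)
```

Sweeping `threshold_db` over a range such as -20 to 5 dB changes how many units are retained, which is the parameter the study varies when it reports the performance plateau. The sketch only produces masks and a masked spectrogram; measuring intelligibility requires listening tests and is outside its scope.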

