The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio.

Affiliation

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People's Republic of China

Publication information

J Acoust Soc Am. 2013 Nov;134(5):EL452-8. doi: 10.1121/1.4824632.

DOI:10.1121/1.4824632
PMID:24181990
Abstract

In this paper, a computational goal for a monaural speech separation system is proposed. Because this goal is derived by maximizing the signal-to-noise ratio (SNR), it is called the optimal ratio mask (ORM). Under the approximate W-Disjoint Orthogonality assumption, which almost always holds due to the sparse nature of speech, theoretical analysis shows that the ORM can improve the SNR by about $10\log_{10} 2$ dB (roughly 3 dB) over the ideal ratio mask. With three kinds of real-world interference, the speech separation results for SNR gain and objective quality evaluation agree with the theoretical analysis and imply that the ORM achieves better separation performance.
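
The abstract does not state the mask formulas, so the following is only a minimal sketch under stated assumptions. It assumes the mask is a real-valued gain applied to each time-frequency unit of the mixture Y = S + N, takes the ideal ratio mask to be the power ratio |S|^2 / (|S|^2 + |N|^2), and uses the least-squares gain Re(S conj(Y)) / |Y|^2 as the SNR-maximizing mask. The function names and the synthetic Gaussian "spectra" are hypothetical and are not taken from the paper.

```python
import math
import random


def ratio_masks(s, n):
    """Return (irm, orm) for one T-F unit, given complex speech s and noise n.

    Assumed definitions (not quoted from the abstract):
      irm = |S|^2 / (|S|^2 + |N|^2)
      orm = (|S|^2 + Re(S N*)) / |S + N|^2, the real gain minimizing |S - M(S+N)|^2
    """
    ps, pn = abs(s) ** 2, abs(n) ** 2
    cross = (s * n.conjugate()).real            # Re(S N*)
    irm = ps / (ps + pn)
    orm = (ps + cross) / (ps + pn + 2 * cross)  # denominator equals |S + N|^2
    return irm, orm


def output_snr_db(speech, noise, masks):
    """SNR in dB of the masked mixture, measured against the clean speech."""
    sig = sum(abs(s) ** 2 for s in speech)
    err = sum(abs(s - m * (s + n)) ** 2
              for s, n, m in zip(speech, noise, masks))
    return 10 * math.log10(sig / err)


# Synthetic complex "spectra" for illustration only (hypothetical data).
rng = random.Random(0)
speech = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
noise = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]

pairs = [ratio_masks(s, n) for s, n in zip(speech, noise)]
irm_snr = output_snr_db(speech, noise, [p[0] for p in pairs])
orm_snr = output_snr_db(speech, noise, [p[1] for p in pairs])
bound_db = 10 * math.log10(2)  # the gain quoted in the abstract, in dB

print(f"IRM output SNR: {irm_snr:.2f} dB")
print(f"ORM output SNR: {orm_snr:.2f} dB")
print(f"10*log10(2)   : {bound_db:.2f} dB")
```

Because the least-squares gain minimizes the residual in every unit, its output SNR can never fall below that of the power-ratio mask in this setup. The dense Gaussian data used here does not satisfy the sparsity assumption behind the abstract's figure, so the measured gap should not be expected to match 10 log10(2) dB.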

Similar articles

1
The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio.
J Acoust Soc Am. 2013 Nov;134(5):EL452-8. doi: 10.1121/1.4824632.
2
Speech intelligibility in reverberation with ideal binary masking: effects of early reflections and signal-to-noise ratio threshold.
J Acoust Soc Am. 2013 Mar;133(3):1707-17. doi: 10.1121/1.4789895.
3
Speech enhancement using empirical mode decomposition and the Teager-Kaiser energy operator.
J Acoust Soc Am. 2014 Jan;135(1):451-9. doi: 10.1121/1.4837835.
4
Impact of phase estimation on single-channel speech separation based on time-frequency masking.
J Acoust Soc Am. 2017 Jun;141(6):4668. doi: 10.1121/1.4986647.
5
Perceptual effects of noise reduction by time-frequency masking of noisy speech.
J Acoust Soc Am. 2012 Oct;132(4):2690-9. doi: 10.1121/1.4747006.
6
The role of binary mask patterns in automatic speech recognition in background noise.
J Acoust Soc Am. 2013 May;133(5):3083-93. doi: 10.1121/1.4798661.
7
A classification based approach to speech segregation.
J Acoust Soc Am. 2012 Nov;132(5):3475-83. doi: 10.1121/1.4754541.
8
Optimal subband Kalman filter for normal and oesophageal speech enhancement.
Biomed Mater Eng. 2014;24(6):3569-78. doi: 10.3233/BME-141183.
9
Long short-term memory for speaker generalization in supervised speech separation.
J Acoust Soc Am. 2017 Jun;141(6):4705. doi: 10.1121/1.4986931.
10
Speech enhancement using a structured codebook.
J Acoust Soc Am. 2012 Oct;132(4):EL329-35. doi: 10.1121/1.4751987.

Cited by

1
Impact of Mask Type as Training Target for Speech Intelligibility and Quality in Cochlear-Implant Noise Reduction.
Sensors (Basel). 2024 Oct 14;24(20):6614. doi: 10.3390/s24206614.
2
Supervised Speech Separation Based on Deep Learning: An Overview.
IEEE/ACM Trans Audio Speech Lang Process. 2018 Oct;26(10):1702-1726. doi: 10.1109/TASLP.2018.2842159. Epub 2018 May 30.