The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio.

Affiliation

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People's Republic of China

Publication information

J Acoust Soc Am. 2013 Nov;134(5):EL452-8. doi: 10.1121/1.4824632.

Abstract

In this paper, a computational goal for a monaural speech separation system is proposed. Because this goal is derived by maximizing the signal-to-noise ratio (SNR), it is called the optimal ratio mask (ORM). Under the approximate W-Disjoint Orthogonality assumption, which almost always holds because of the sparse nature of speech, theoretical analysis shows that the ORM can improve the SNR by about 10 log10(2) dB (roughly 3 dB) over the ideal ratio mask. With three kinds of real-world interference, the speech separation results, measured by SNR gain and objective quality evaluation, support the theoretical analysis and imply that the ORM achieves better separation performance.
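The idea of a mask chosen to maximize SNR can be illustrated with a minimal sketch. The abstract does not give the ORM formula, so everything below is an assumption: the mask is taken to be real-valued and applied to the mixture spectrum Y = S + N, in which case the per-bin gain that minimizes |S - M·Y|² is Re(S·conj(Y)) / |Y|². The function names, the random complex arrays standing in for STFT coefficients, and the SNR definition in the time-frequency domain are all hypothetical and may differ from the paper's own definitions and data.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in empty bins


def ideal_ratio_mask(S, N):
    """Power-based ratio mask |S|^2 / (|S|^2 + |N|^2) (assumed IRM form)."""
    ps, pn = np.abs(S) ** 2, np.abs(N) ** 2
    return ps / (ps + pn + EPS)


def least_squares_real_mask(S, N):
    """Real gain minimizing |S - M*(S+N)|^2 per bin (assumed stand-in for the ORM)."""
    Y = S + N
    return np.real(S * np.conj(Y)) / (np.abs(Y) ** 2 + EPS)


def snr_db(S, S_hat):
    """SNR of an estimate, computed over all time-frequency bins."""
    err = np.sum(np.abs(S - S_hat) ** 2)
    return 10 * np.log10(np.sum(np.abs(S) ** 2) / err)


# Synthetic complex "spectrograms" (hypothetical data, not real speech).
rng = np.random.default_rng(0)
shape = (257, 100)  # frequency bins x frames
S = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
N = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
Y = S + N

snr_irm = snr_db(S, ideal_ratio_mask(S, N) * Y)
snr_lsq = snr_db(S, least_squares_real_mask(S, N) * Y)
bound_db = 10 * np.log10(2)  # the "about 3 dB" figure quoted in the abstract

print(f"SNR with ratio mask:         {snr_irm:.2f} dB")
print(f"SNR with least-squares mask: {snr_lsq:.2f} dB")
print(f"10*log10(2) = {bound_db:.4f} dB")
```

Because the least-squares gain is optimal among real masks in every bin, its SNR cannot fall below that of the ratio mask under this SNR definition. The size of the gap on random data says nothing about real speech; the roughly 3 dB figure in the abstract is a theoretical result under the approximate W-Disjoint Orthogonality assumption.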
