Impact of phase estimation on single-channel speech separation based on time-frequency masking.

Authors

Mayer Florian, Williamson Donald S, Mowlaee Pejman, Wang DeLiang

Affiliations

FH Joanneum - University of Applied Sciences, Graz, Austria.

Department of Computer Science, Indiana University, Bloomington, Indiana 47405, USA.

Publication information

J Acoust Soc Am. 2017 Jun;141(6):4668. doi: 10.1121/1.4986647.

DOI:10.1121/1.4986647

PMID:28679243

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6909979/

Abstract

Time-frequency masking is a common solution for the single-channel source separation (SCSS) problem where the goal is to find a time-frequency mask that separates the underlying sources from an observed mixture. An estimated mask is then applied to the mixed signal to extract the desired signal. During signal reconstruction, the time-frequency-masked spectral amplitude is combined with the mixture phase. This article considers the impact of replacing the mixture spectral phase with an estimated clean spectral phase combined with the estimated magnitude spectrum using a conventional model-based approach. As the proposed phase estimator requires estimated fundamental frequency of the underlying signal from the mixture, a robust pitch estimator is proposed. The upper-bound clean phase results show the potential of phase-aware processing in single-channel source separation. Also, the experiments demonstrate that replacing the mixture phase with the estimated clean spectral phase consistently improves perceptual speech quality, predicted speech intelligibility, and source separation performance across all signal-to-noise ratio and noise scenarios.
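The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random complex matrices stand in for real STFT spectra, the ratio mask is an assumed oracle mask chosen for illustration, and the helper name `apply_mask` is hypothetical. The sketch only compares reconstruction with the mixture phase against the clean-phase upper bound; the paper's model-based phase estimator and pitch estimator are not reproduced here.

```python
import numpy as np

def apply_mask(mixture_stft, mask, phase=None):
    """Scale the mixture magnitude by `mask` and attach a phase.

    With phase=None the mixture phase is reused, which is the
    conventional reconstruction described in the abstract.
    (Hypothetical helper, for illustration only.)
    """
    magnitude = mask * np.abs(mixture_stft)
    if phase is None:
        phase = np.angle(mixture_stft)
    return magnitude * np.exp(1j * phase)

# Synthetic stand-ins for STFT spectra (assumption: not real speech data).
rng = np.random.default_rng(0)
shape = (257, 100)  # (frequency bins, frames), arbitrary sizes
speech = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = speech + noise

# Assumed oracle ratio mask; the paper's mask estimator may differ.
mask = np.abs(speech) / (np.abs(speech) + np.abs(noise))

# Same masked magnitude, two different phases.
est_mixture_phase = apply_mask(mixture, mask)
est_clean_phase = apply_mask(mixture, mask, phase=np.angle(speech))

# Mean squared error against the clean speech spectrum.
err_mixture_phase = np.mean(np.abs(est_mixture_phase - speech) ** 2)
err_clean_phase = np.mean(np.abs(est_clean_phase - speech) ** 2)
print(err_mixture_phase, err_clean_phase)
```

For a fixed magnitude, the distance to the clean spectrum in each time-frequency bin is smallest when the phase equals the clean phase, so the clean-phase error here cannot exceed the mixture-phase error. That is the sense in which clean phase acts as an upper bound on what a phase estimator could achieve.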
