Department of Electronic Engineering/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China.
Sensors (Basel). 2017 Jun 20;17(6):1447. doi: 10.3390/s17061447.
In speech separation tasks, many methods require the microphones to be closely spaced, because they cannot cope with phase wrap-around. In this paper, we present a novel two-microphone speech separation scheme that does not have this restriction. The technique uses estimated interaural time difference (ITD) statistics and a binary time-frequency mask to separate mixed speech sources. The novelties of the paper are: (1) the extended application of delay-and-sum beamforming (DSB) and the cosine function to ITD calculation; and (2) the clarification of the connection between the ideal binary mask and the DSB amplitude ratio. Objective quality evaluation experiments demonstrate the effectiveness of the proposed method.
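The link between a binary time-frequency mask and a DSB amplitude ratio can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dsb_amplitude` and `binary_mask`, the candidate delays, and the assumption that each time-frequency bin is dominated by a single source are all hypothetical choices made for illustration. For one source reaching the second microphone with delay d, steering a two-microphone DSB to delay tau gives an output amplitude of 2|S||cos(pi f (tau - d))|, which is where a cosine function enters; comparing the amplitudes obtained for two candidate delays yields a binary decision per bin.

```python
import numpy as np

def dsb_amplitude(X1, X2, freqs_hz, tau):
    """Amplitude of a two-microphone delay-and-sum output steered to delay tau (s).

    X1, X2: complex spectra of shape (n_freqs, n_frames).
    freqs_hz: centre frequency of each row, shape (n_freqs,).
    """
    steer = np.exp(1j * 2 * np.pi * freqs_hz * tau)[:, None]
    return np.abs(X1 + X2 * steer)

def binary_mask(X1, X2, freqs_hz, tau_target, tau_interf):
    """1.0 where the DSB steered to the target is louder than the one
    steered to the interferer (amplitude ratio > 1), else 0.0."""
    a_target = dsb_amplitude(X1, X2, freqs_hz, tau_target)
    a_interf = dsb_amplitude(X1, X2, freqs_hz, tau_interf)
    return (a_target > a_interf).astype(float)

# Synthetic demo: every bin belongs to exactly one of two sources.
# Delays and frequencies below are illustrative assumptions only.
rng = np.random.default_rng(0)
freqs = np.array([300.0, 700.0, 1100.0, 1900.0, 2300.0])  # Hz, f = 0 excluded
tau_a, tau_b = 0.5e-3, -0.3e-3                            # seconds
n_frames = 8

S = (1.0 + rng.random((5, n_frames))) * np.exp(1j * 2 * np.pi * rng.random((5, n_frames)))
owner = rng.integers(0, 2, size=(5, n_frames)).astype(float)  # 1 = source A
delay = np.where(owner == 1.0, tau_a, tau_b)

X1 = S                                                    # reference microphone
X2 = S * np.exp(-1j * 2 * np.pi * freqs[:, None] * delay) # delayed copy
mask = binary_mask(X1, X2, freqs, tau_a, tau_b)
```

In this idealized setting the mask reproduces the `owner` pattern. The comparison becomes ambiguous at frequencies where f times the delay difference is an integer, since both steered amplitudes then coincide; the demo frequencies were chosen to avoid those points.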