GN Hearing A/S, DK-2750 Ballerup, Denmark.
Audio Analysis Lab, AD:MT, Aalborg University, DK-9000 Aalborg, Denmark.
J Acoust Soc Am. 2018 Jun;143(6):3912. doi: 10.1121/1.5042222.
Speech localization and enhancement involve sound source mapping and reconstruction from noisy microphone-array recordings of speech mixtures. Conventional beamforming methods suffer from low resolution, especially when the number of microphones is limited. In practice, there are only a few sources compared to the number of possible directions-of-arrival (DOAs). Hence, DOA estimation is formulated as a sparse signal reconstruction problem and solved with sparse Bayesian learning (SBL). SBL uses hierarchical two-level Bayesian inference to reconstruct sparse estimates from a small set of observations. The first level derives the posterior probability of the complex source amplitudes from the data likelihood and the prior. The second level tunes the prior towards sparse solutions through hyperparameters that maximize the evidence, i.e., the data probability. Learning the hyperparameters adaptively from the data auto-regularizes the inference problem towards sparse, robust estimates. Simulations and experimental data demonstrate that SBL beamforming provides high-resolution DOA maps that outperform traditional methods, especially for correlated or non-stationary signals. For speech signals specifically, the high-resolution SBL reconstruction offers not only speech enhancement but, in effect, speech separation.
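The two-level inference described above can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the uniform linear array with half-wavelength spacing, the known and fixed noise variance, the 1-degree angle grid, the multiplicative fixed-point update for the hyperparameters, and the names `steering_matrix` and `sbl_doa` are all assumptions chosen for illustration. The hyperparameters `gamma` play the role of per-direction source powers; most of them shrink towards zero, which is what yields a sparse DOA map.

```python
import numpy as np

def steering_matrix(n_mics, angles_deg, spacing=0.5):
    """Steering vectors of a uniform linear array (assumed geometry).

    spacing is the microphone spacing in wavelengths.
    """
    m = np.arange(n_mics)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(-2j * np.pi * spacing * m * np.sin(theta))

def sbl_doa(Y, A, noise_var, n_iter=200):
    """Sketch of multi-snapshot SBL with a fixed, known noise variance.

    Y: (n_mics, n_snapshots) complex data, A: (n_mics, n_angles) dictionary.
    Returns the hyperparameters gamma and the posterior-mean amplitudes.
    """
    n_mics, n_snap = Y.shape
    Sy = Y @ Y.conj().T / n_snap              # sample covariance of the data
    gamma = np.ones(A.shape[1])               # prior source variances
    for _ in range(n_iter):
        # Second level: evidence-based update of the hyperparameters.
        Sigma = noise_var * np.eye(n_mics) + (A * gamma) @ A.conj().T
        Sinv_A = np.linalg.solve(Sigma, A)    # Sigma^{-1} a for every direction
        num = np.real(np.sum(Sinv_A.conj() * (Sy @ Sinv_A), axis=0))
        den = np.real(np.sum(A.conj() * Sinv_A, axis=0))
        gamma = gamma * num / den             # assumed fixed-point rule
    # First level: posterior mean of the amplitudes given the learned prior.
    Sigma = noise_var * np.eye(n_mics) + (A * gamma) @ A.conj().T
    X_hat = gamma[:, None] * (A.conj().T @ np.linalg.solve(Sigma, Y))
    return gamma, X_hat

# Synthetic demo: two on-grid sources, 8 microphones, 100 snapshots.
rng = np.random.default_rng(0)
n_mics, n_snap, noise_var = 8, 100, 0.01
angles = np.arange(-89, 90)                   # 1-degree grid (assumption)
true_doas = [-20, 30]
A = steering_matrix(n_mics, angles)
X = (rng.standard_normal((2, n_snap))
     + 1j * rng.standard_normal((2, n_snap))) / np.sqrt(2)
noise = np.sqrt(noise_var / 2) * (rng.standard_normal((n_mics, n_snap))
                                  + 1j * rng.standard_normal((n_mics, n_snap)))
Y = steering_matrix(n_mics, true_doas) @ X + noise
gamma, X_hat = sbl_doa(Y, A, noise_var)
print("two largest hyperparameters at (deg):", angles[np.argsort(gamma)[-2:]])
```

In this sketch the rows of `X_hat` at the dominant directions are per-source amplitude estimates, which is the sense in which a high-resolution reconstruction can act as source separation. A practical speech system would additionally need to work per frequency band and estimate the noise variance, neither of which is attempted here.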