Boston University, Boston, MA, USA
Trends Hear. 2016 Oct 3;20:2331216516669919. doi: 10.1177/2331216516669919.
Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers, both for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal, and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires few computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception thresholds match the pattern of the measured data well, although the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model.
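The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the two ear signals are already available as complex time-frequency arrays, that the target direction is summarized by a single interaural time difference and level difference, and that a unit counts as target dominated when cancellation removes more than a fixed amount of energy. The function name `ec_binary_mask`, its parameters, the sign conventions, the energy definitions, and the 3 dB threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def ec_binary_mask(left_tf, right_tf, freqs_hz=None, target_itd_s=0.0,
                   target_ild_db=0.0, threshold_db=3.0, eps=1e-12):
    """Estimate a time-frequency binary mask with an EC-style rule.

    left_tf, right_tf : complex arrays of shape (n_freq, n_frames).
    freqs_hz          : center frequency of each row (assumed input).
    target_itd_s      : assumed delay of the right ear relative to the left
                        for a source in the target direction, in seconds.
    target_ild_db     : assumed left-minus-right level difference in dB.
    threshold_db      : assumed decision threshold on the energy drop.
    """
    left_tf = np.asarray(left_tf, dtype=complex)
    right_tf = np.asarray(right_tf, dtype=complex)
    if freqs_hz is None:
        # With no frequencies given, no phase compensation is applied.
        freqs_hz = np.zeros(left_tf.shape[0])
    freqs_hz = np.asarray(freqs_hz, dtype=float)

    # Equalization: undo the target's interaural delay and level difference
    # so that a pure target is identical in both channels.
    gain = 10.0 ** (target_ild_db / 20.0)
    phase = np.exp(1j * 2.0 * np.pi * freqs_hz * target_itd_s)[:, None]
    right_eq = gain * phase * right_tf

    # Cancellation: subtracting the equalized channels removes the target.
    residual = left_tf - right_eq

    # Energy change between EC input and EC output, in dB, per unit.
    energy_in = 0.5 * (np.abs(left_tf) ** 2 + np.abs(right_eq) ** 2)
    energy_out = np.abs(residual) ** 2
    drop_db = 10.0 * np.log10((energy_in + eps) / (energy_out + eps))

    # Binary decision: a large drop means the unit was mostly target.
    return drop_db > threshold_db


# Toy example: one frequency channel, two frames, target straight ahead.
left = np.array([[1.0 + 0.0j, 1.0 + 0.0j]])
right = np.array([[1.0 + 0.0j, 0.0 + 1.0j]])  # second frame is out of phase
mask = ec_binary_mask(left, right)
```

In the toy example the first unit is identical at both ears, so cancellation removes nearly all of its energy and the unit is kept. The second unit has a 90 degree interaural phase difference, so cancellation does not reduce its energy and the unit is rejected.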