Comparison of a target-equalization-cancellation approach and a localization approach to source separation.

Author information

Mi Jing, Groll Matti, Colburn H Steven

Affiliation

Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, USA.

Publication information

J Acoust Soc Am. 2017 Nov;142(5):2933. doi: 10.1121/1.5009763.

Abstract

Interaural differences are important for listeners to be able to maintain focus on a sound source of interest in the presence of multiple sources. Because interaural differences are sound localization cues, most binaural-cue-based source separation algorithms attempt separation by localizing each time-frequency (T-F) unit to one of the possible source directions using interaural differences. By assembling the T-F units that are assigned to one direction, the sound stream from that direction is enhanced. In this paper, a different type of binaural cue for source-separation purposes is proposed. For each T-F unit, the equalization-cancellation (EC) operation is applied to cancel the signal from the target direction; the dominance of the target in that T-F unit is then determined by the effectiveness of the cancellation. Specifically, the energy change from cancellation is used as the criterion for target dominance for each T-F unit. Source-separation performance using the target-EC cue is compared with performance using localization cues. With simulated multi-talker and diffuse-babble interferers, the algorithm based on target-EC cues yields better source-separation performance than the algorithm based on localization cues, both in direct comparison with the ideal binary mask and in measured speech intelligibility for the separated target streams.
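
The target-EC cue described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filterbank, the equalization parameters, or the decision rule. The sketch assumes that each T-F unit is a single complex STFT coefficient per ear, that equalization is a pure delay plus gain applied to the right-ear signal, and that a unit counts as target-dominated when cancellation reduces its energy by more than a fixed threshold. The function name `target_ec_mask` and the parameters `target_itd_s`, `target_gain`, and `threshold_db` are hypothetical.

```python
import numpy as np


def target_ec_mask(left_tf, right_tf, freqs_hz, target_itd_s=0.0,
                   target_gain=1.0, threshold_db=3.0):
    """Binary mask from a target-EC-style cue (illustrative sketch only).

    left_tf, right_tf : complex arrays, shape (n_freq, n_frames)
    freqs_hz          : center frequency of each row, shape (n_freq,)
    target_itd_s      : assumed delay of the right ear relative to the left
                        for a source in the target direction (seconds)
    target_gain       : assumed left/right level ratio for the target
    threshold_db      : hypothetical decision threshold on energy reduction
    """
    freqs = np.asarray(freqs_hz, dtype=float)[:, None]

    # Equalization: undo the target's interaural delay and level difference
    # so that a pure target component is identical at the two ears.
    right_eq = target_gain * np.exp(2j * np.pi * freqs * target_itd_s) * right_tf

    # Cancellation: subtract the equalized ear signals. A target-dominated
    # unit leaves little residual; an interferer-dominated unit does not.
    residual = left_tf - right_eq

    eps = 1e-12  # avoids division by zero and log of zero
    energy_before = 0.5 * (np.abs(left_tf) ** 2 + np.abs(right_eq) ** 2)
    energy_after = np.abs(residual) ** 2

    # Energy change from cancellation, in dB, used as the dominance criterion.
    reduction_db = 10.0 * np.log10((energy_before + eps) / (energy_after + eps))
    return reduction_db > threshold_db
```

Under these assumptions, multiplying the mixture's T-F representation by the returned mask and inverting the transform would give an estimate of the target stream. A localization-based mask would instead estimate a direction for each unit and keep the units assigned to the target direction.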
