Dual-Channel Cosine Function Based ITD Estimation for Robust Speech Separation.

Affiliation

Department of Electronic Engineering/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China.

Publication information

Sensors (Basel). 2017 Jun 20;17(6):1447. doi: 10.3390/s17061447.

Abstract

In speech separation tasks, many methods require the microphones to be closely spaced, which means they fail when phase wrap-around occurs. In this paper, we present a novel two-microphone speech separation scheme that does not have this restriction. The technique uses estimated interaural time difference (ITD) statistics and a binary time-frequency mask to separate mixed speech sources. The novelties of the paper are: (1) the extended application of delay-and-sum beamforming (DSB) and the cosine function to ITD calculation; and (2) the clarification of the connection between the ideal binary mask and the DSB amplitude ratio. Our objective quality evaluation experiments demonstrate the effectiveness of the proposed method.
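The link between DSB and a cosine function can be illustrated with a small sketch. It is not the paper's algorithm, only a minimal illustration under stated assumptions: a single time-frequency unit is dominated by one source, the second channel is a pure delayed copy of the first, and the mask is decided by comparing DSB amplitudes for two known candidate delays. The function names (`dsb_amplitude`, `binary_mask_unit`, `simulate_unit`) and all parameter values are hypothetical. Under these assumptions the DSB magnitude for steering delay `tau` and true delay `itd` is `2 * |x1| * |cos(omega * (tau - itd) / 2)|`.

```python
import cmath
import math


def dsb_amplitude(x1, x2, omega, tau):
    """Magnitude of a two-channel delay-and-sum output for one T-F unit.

    x1, x2: complex STFT coefficients of the two microphones.
    omega:  angular frequency of the bin in rad/s.
    tau:    steering delay in seconds applied to the second channel.
    """
    return abs(x1 + x2 * cmath.exp(1j * omega * tau))


def binary_mask_unit(x1, x2, omega, tau_target, tau_interferer):
    """Return 1 if steering at the target gives the larger DSB amplitude.

    Assumption: both candidate delays are already known (for example from
    ITD statistics); this comparison rule is illustrative only.
    """
    a_target = dsb_amplitude(x1, x2, omega, tau_target)
    a_interf = dsb_amplitude(x1, x2, omega, tau_interferer)
    return 1 if a_target > a_interf else 0


def simulate_unit(amplitude, omega, itd):
    """Hypothetical single-source T-F unit: channel 2 lags channel 1 by itd."""
    x1 = complex(amplitude, 0.0)
    x2 = x1 * cmath.exp(-1j * omega * itd)
    return x1, x2


if __name__ == "__main__":
    omega = 2.0 * math.pi * 1000.0           # 1 kHz bin (assumed)
    x1, x2 = simulate_unit(1.0, omega, 3e-4)  # true ITD of 0.3 ms (assumed)
    # Steering at the true delay gives the larger amplitude, so the mask is 1.
    print(binary_mask_unit(x1, x2, omega, 3e-4, -3e-4))
```

When the steering delay matches the true delay the two channels add coherently and the amplitude reaches its maximum of `2 * |x1|`; any mismatch scales it down by the cosine term, which is what makes the amplitude ratio usable as a mask criterion in this simplified setting.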

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/49baccdc7fc7/sensors-17-01447-g001.jpg
