Dual-Channel Cosine Function Based ITD Estimation for Robust Speech Separation.

Affiliation

Department of Electronic Engineering/Graduate School at Shenzhen, Tsinghua University, Beijing 100084, China.

Publication information

Sensors (Basel). 2017 Jun 20;17(6):1447. doi: 10.3390/s17061447.

DOI:10.3390/s17061447
PMID:28632166
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5492097/
Abstract

Many speech separation methods require the microphones to be closely spaced, which means they cannot cope with phase wrap-around. In this paper, we present a novel two-microphone speech separation scheme that does not have this restriction. The technique uses estimated interaural time difference (ITD) statistics and a binary time-frequency mask to separate mixed speech sources. The paper makes two novel contributions: (1) an extended application of delay-and-sum beamforming (DSB) and the cosine function to ITD calculation; and (2) a clarification of the connection between the ideal binary mask and the DSB amplitude ratio. Objective quality evaluation experiments demonstrate the effectiveness of the proposed method.
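
To make the link between DSB, the cosine function, and phase wrap-around concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's algorithm. For two equal-amplitude channels, a delay-and-sum output steered to delay tau has power proportional to (1 + cos(omega * (tau - d))) / 2, where d is the true ITD, so summing cos(omega * tau - dphi) over frequency bins peaks at tau = d. Because the cosine is 2*pi-periodic, the wrapped phase difference can be used directly. The function name `estimate_itd`, the candidate grid, and the white-noise toy signal are all hypothetical choices made for this example.

```python
import numpy as np

def estimate_itd(x1, x2, fs, max_delay, n_candidates=401):
    """Return the candidate delay (seconds) that maximizes a sum-of-cosines score."""
    X1 = np.fft.rfft(x1)
    X2 = np.fft.rfft(x2)
    omega = 2 * np.pi * np.fft.rfftfreq(len(x1), d=1.0 / fs)  # rad/s per bin
    # Inter-channel phase difference per bin, wrapped to (-pi, pi].
    dphi = np.angle(X1 * np.conj(X2))
    taus = np.linspace(-max_delay, max_delay, n_candidates)
    # cos() is 2*pi-periodic, so the wrapped phase needs no unwrapping.
    scores = np.cos(omega[None, :] * taus[:, None] - dphi[None, :]).sum(axis=1)
    return taus[np.argmax(scores)]

# Toy data (assumed): white noise, second channel delayed by 8 samples.
fs = 16000
n = 1024
true_itd = 8 / fs  # 0.5 ms, so the phase difference wraps above 1 kHz
rng = np.random.default_rng(0)
x1 = rng.standard_normal(n)
omega = 2 * np.pi * np.fft.rfftfreq(n, d=1.0 / fs)
# Apply the delay as a linear phase shift (a circular delay of the frame).
x2 = np.fft.irfft(np.fft.rfft(x1) * np.exp(-1j * omega * true_itd), n)

estimate = estimate_itd(x1, x2, fs, max_delay=1e-3)
print(f"true ITD = {true_itd * 1e3:.3f} ms, estimate = {estimate * 1e3:.3f} ms")
```

The sketch handles a single source in a single frame with no noise or reverberation. A separation system along the lines the abstract describes would additionally need per-source ITD statistics and a time-frequency mask, which are not shown here.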


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/49baccdc7fc7/sensors-17-01447-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/61c384f58045/sensors-17-01447-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/45ef15902cd1/sensors-17-01447-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/58429fc99fd4/sensors-17-01447-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/4b79bb3a0f93/sensors-17-01447-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/e42780d522ee/sensors-17-01447-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/d9dff0cc1fd3/sensors-17-01447-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/a3c2712803ff/sensors-17-01447-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/0783d2d60d88/sensors-17-01447-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6469/5492097/5fff6af3b835/sensors-17-01447-g010.jpg


