ATTENTION-BASED FUSION FOR BONE-CONDUCTED AND AIR-CONDUCTED SPEECH ENHANCEMENT IN THE COMPLEX DOMAIN.

Authors

Wang Heming, Zhang Xueliang, Wang DeLiang

Affiliations

Department of Computer Science and Engineering, The Ohio State University, USA.

Department of Computer Science, Inner Mongolia University, China.

Publication information

Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:7757-7761. doi: 10.1109/icassp43922.2022.9746374. Epub 2022 Apr 27.

Abstract

Bone-conduction (BC) microphones capture speech signals by converting the vibrations of the human skull into electrical signals. BC sensors are insensitive to acoustic noise, but limited in bandwidth. On the other hand, conventional or air-conduction (AC) microphones are capable of capturing full-band speech, but are susceptible to background noise. We propose to combine the strengths of AC and BC microphones by employing a convolutional recurrent network that performs complex spectral mapping. To better utilize signals from both kinds of microphone, we employ attention-based fusion with early-fusion and late-fusion strategies. Experiments demonstrate the superiority of the proposed method over other recent speech enhancement methods combining BC and AC signals. In addition, our enhancement performance is significantly better than conventional speech enhancement counterparts, especially in low signal-to-noise ratio scenarios.
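The abstract names three ingredients (complex spectral mapping, early fusion, and attention-based fusion) without giving their details. The following is a minimal NumPy sketch of how such pieces could look, not the authors' implementation. Every name here (`stft_real_imag`, `early_fusion`, `attention_fusion`, `w_ac`, `w_bc`) is hypothetical, as are the frame length, hop size, and the choice of a per-time-frequency softmax over the two streams; the paper's convolutional recurrent network and its actual attention module are not reproduced.

```python
import numpy as np


def stft_real_imag(signal, frame_len=320, hop=160):
    """Return the real and imaginary STFT parts, shape (2, frames, bins).

    Complex spectral mapping works on these two parts rather than on the
    magnitude alone. frame_len and hop are assumed values (20 ms and 10 ms
    at an assumed 16 kHz sampling rate).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=-1)
    return np.stack([spec.real, spec.imag])


def early_fusion(ac_feat, bc_feat):
    """Early fusion (assumed form): concatenate both streams channel-wise."""
    return np.concatenate([ac_feat, bc_feat], axis=0)


def attention_fusion(ac_feat, bc_feat, w_ac, w_bc):
    """Attention-based fusion (assumed form): softmax-weighted sum of streams.

    w_ac and w_bc stand in for learned projection vectors of shape
    (channels,). They give one score per time-frequency unit and stream.
    """
    score_ac = np.tensordot(w_ac, ac_feat, axes=(0, 0))  # (frames, bins)
    score_bc = np.tensordot(w_bc, bc_feat, axes=(0, 0))  # (frames, bins)
    scores = np.stack([score_ac, score_bc])              # (2, frames, bins)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Broadcast each (frames, bins) weight map over the real/imag channels.
    fused = weights[0] * ac_feat + weights[1] * bc_feat
    return fused, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random noise stands in for one second of AC and BC recordings.
    ac_feat = stft_real_imag(rng.standard_normal(16000))
    bc_feat = stft_real_imag(rng.standard_normal(16000))
    stacked = early_fusion(ac_feat, bc_feat)
    fused, weights = attention_fusion(
        ac_feat, bc_feat, rng.standard_normal(2), rng.standard_normal(2)
    )
    print(stacked.shape, fused.shape, weights.shape)
```

In this sketch the attention weights let the fused representation lean on the BC stream where the AC stream is noisy and on the AC stream where the BC stream lacks bandwidth, which is the motivation the abstract gives for combining the two microphones. A late-fusion variant would apply the same weighting to features produced by separate per-stream encoders rather than to the raw spectra.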

Similar articles

1
ATTENTION-BASED FUSION FOR BONE-CONDUCTED AND AIR-CONDUCTED SPEECH ENHANCEMENT IN THE COMPLEX DOMAIN.
Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:7757-7761. doi: 10.1109/icassp43922.2022.9746374. Epub 2022 Apr 27.
2
Fusing Bone-conduction and Air-conduction Sensors for Complex-Domain Speech Enhancement.
IEEE/ACM Trans Audio Speech Lang Process. 2022;30:3134-3143. doi: 10.1109/taslp.2022.3209943. Epub 2022 Sep 26.
7
Deep Learning Based Real-time Speech Enhancement for Dual-microphone Mobile Phones.
IEEE/ACM Trans Audio Speech Lang Process. 2021;29:1853-1863. doi: 10.1109/taslp.2021.3082318. Epub 2021 May 21.
