ATTENTION-BASED FUSION FOR BONE-CONDUCTED AND AIR-CONDUCTED SPEECH ENHANCEMENT IN THE COMPLEX DOMAIN.

Authors

Wang Heming, Zhang Xueliang, Wang DeLiang

Affiliations

Department of Computer Science and Engineering, The Ohio State University, USA.

Department of Computer Science, Inner Mongolia University, China.

Publication information

Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:7757-7761. doi: 10.1109/icassp43922.2022.9746374. Epub 2022 Apr 27.

DOI:10.1109/icassp43922.2022.9746374
PMID:40313328
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12045135/
Abstract

Bone-conduction (BC) microphones capture speech signals by converting the vibrations of the human skull into electrical signals. BC sensors are insensitive to acoustic noise, but limited in bandwidth. On the other hand, conventional or air-conduction (AC) microphones are capable of capturing full-band speech, but are susceptible to background noise. We propose to combine the strengths of AC and BC microphones by employing a convolutional recurrent network that performs complex spectral mapping. To better utilize signals from both kinds of microphone, we employ attention-based fusion with early-fusion and late-fusion strategies. Experiments demonstrate the superiority of the proposed method over other recent speech enhancement methods combining BC and AC signals. In addition, our enhancement performance is significantly better than conventional speech enhancement counterparts, especially in low signal-to-noise ratio scenarios.
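
The fusion idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the abstract does not specify the attention formulation, so the per-bin softmax weighting, the scoring vectors `w_ac` and `w_bc`, the STFT settings, and the moving-average stand-in for a band-limited BC signal are all assumptions made for illustration. The sketch represents each signal by the real and imaginary parts of its STFT (the complex domain) and fuses the two feature maps at the input, which is one plausible reading of early fusion; the same weighting could be applied to intermediate features for a late-fusion variant.

```python
import numpy as np

def stft_features(x, frame_len=320, hop=160):
    """Return real/imaginary STFT features with shape (2, frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=-1)
    return np.stack([spec.real, spec.imag])

def attention_fuse(ac, bc, w_ac, w_bc):
    """Fuse two (C, T, F) feature maps with per-bin softmax weights.

    w_ac and w_bc are hypothetical scoring vectors of shape (C,); in a
    trained model they would be learned parameters.
    """
    score_ac = np.tensordot(w_ac, ac, axes=(0, 0))  # (T, F)
    score_bc = np.tensordot(w_bc, bc, axes=(0, 0))  # (T, F)
    scores = np.stack([score_ac, score_bc])         # (2, T, F)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Convex combination of the two sensors in every time-frequency bin.
    fused = weights[0] * ac + weights[1] * bc
    return fused, weights

rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
# Assumed toy signals: AC is full-band but noisy, BC is clean but smoothed.
ac_signal = clean + 0.5 * rng.standard_normal(16000)
bc_signal = np.convolve(clean, np.ones(8) / 8, mode="same")

ac_feat = stft_features(ac_signal)
bc_feat = stft_features(bc_signal)
w_ac = rng.standard_normal(2)
w_bc = rng.standard_normal(2)
fused, weights = attention_fuse(ac_feat, bc_feat, w_ac, w_bc)
print(fused.shape, weights.shape)
```

In a full system, the fused features would feed a convolutional recurrent network trained to predict the real and imaginary parts of the clean speech spectrogram; that part is omitted here.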

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d17/12045135/1e4e50cbb9da/nihms-2076294-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d17/12045135/afaeb883c7f4/nihms-2076294-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d17/12045135/4932248170eb/nihms-2076294-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d17/12045135/00337448711b/nihms-2076294-f0004.jpg

Similar articles

1. ATTENTION-BASED FUSION FOR BONE-CONDUCTED AND AIR-CONDUCTED SPEECH ENHANCEMENT IN THE COMPLEX DOMAIN.
   Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:7757-7761. doi: 10.1109/icassp43922.2022.9746374. Epub 2022 Apr 27.
2. Fusing Bone-conduction and Air-conduction Sensors for Complex-Domain Speech Enhancement.
   IEEE/ACM Trans Audio Speech Lang Process. 2022;30:3134-3143. doi: 10.1109/taslp.2022.3209943. Epub 2022 Sep 26.
3. A Robust Dual-Microphone Generalized Sidelobe Canceller Using a Bone-Conduction Sensor for Speech Enhancement.
   Sensors (Basel). 2021 Mar 8;21(5):1878. doi: 10.3390/s21051878.
4. A lightweight speech enhancement network fusing bone- and air-conducted speech.
   J Acoust Soc Am. 2024 Aug 1;156(2):1355-1366. doi: 10.1121/10.0028339.
5. Bone-Conduction Sensor Assisted Noise Estimation for Improved Speech Enhancement.
   Interspeech. 2018 Sep;2018:1180-1184. doi: 10.21437/interspeech.2018-1046.
6. A Real-Time Dual-Microphone Speech Enhancement Algorithm Assisted by Bone Conduction Sensor.
   Sensors (Basel). 2020 Sep 5;20(18):5050. doi: 10.3390/s20185050.
7. Deep Learning Based Real-time Speech Enhancement for Dual-microphone Mobile Phones.
   IEEE/ACM Trans Audio Speech Lang Process. 2021;29:1853-1863. doi: 10.1109/taslp.2021.3082318. Epub 2021 May 21.
8. The effect of bone conduction microphone placement on intensity and spectrum of transmitted speech items.
   J Acoust Soc Am. 2013 Jun;133(6):3900-8. doi: 10.1121/1.4803870.
9. Regional Language Speech Recognition from Bone-Conducted Speech Signals through Different Deep Learning Architectures.
   Comput Intell Neurosci. 2022 Aug 25;2022:4473952. doi: 10.1155/2022/4473952. eCollection 2022.
10. Model-based speech enhancement using a bone-conducted signal.
    J Acoust Soc Am. 2012 Mar;131(3):EL262-7. doi: 10.1121/1.3687014.

Cited by

1. A Piezoelectric Micromachined Ultrasonic Transducer-Based Bone Conduction Microphone System for Enhancing Speech Recognition Accuracy.
   Micromachines (Basel). 2025 May 23;16(6):613. doi: 10.3390/mi16060613.
