
Sound source localization based on residual network and channel attention module.

Affiliations

School of Naval Architecture, Ocean and Energy Power Engineering, Wuhan University of Technology, Wuhan, 430063, Hubei, China.

School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, Hubei, China.

Publication information

Sci Rep. 2023 Apr 3;13(1):5443. doi: 10.1038/s41598-023-32657-7.

DOI:10.1038/s41598-023-32657-7
PMID:37012391
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10070247/
Abstract

This paper presents a sound source localization (SSL) model based on a residual network and a channel attention mechanism. The method takes the combination of the log-Mel spectrogram and the generalized cross-correlation with phase transform (GCC-PHAT) as input features and extracts time-frequency information using the residual structure and the channel attention mechanism, thereby improving localization performance. Residual blocks are introduced to extract deeper features: they allow more layers to be stacked for high-level features while avoiding vanishing or exploding gradients. The attention mechanism is applied in the feature extraction stage of the proposed SSL model so that it focuses on the most important information in the input features. We use signals collected by a microphone array to explore the performance of the model under different features and to find the most suitable input features for the proposed method. We also compare our method with other models on a public dataset. Experimental results show a substantial improvement in sound source localization performance.
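As a rough illustration of the GCC-PHAT feature named in the abstract, the sketch below estimates the time delay between two microphone channels with NumPy. It is a generic textbook formulation, not the authors' code: the function name `gcc_phat`, the `max_tau` parameter, the 16 kHz sampling rate, and the synthetic 5-sample delay are all assumptions made for the example. The log-Mel branch of the input features is not shown.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` in seconds via GCC-PHAT.

    Returns (tau, cc), where cc is the cross-correlation over the searched lags.
    """
    n = sig.shape[0] + ref.shape[0]        # zero-pad so the correlation is linear, not circular
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)             # cross-power spectrum
    cross /= np.abs(cross) + 1e-12         # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:                # optionally restrict to physically possible lags
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(fs), cc

# Hypothetical two-channel example: white noise delayed by 5 samples.
rng = np.random.default_rng(0)
fs = 16000                                  # assumed sampling rate
ref = rng.standard_normal(1024)
delay = 5
sig = np.concatenate((np.zeros(delay), ref[:-delay]))

tau, cc = gcc_phat(sig, ref, fs)
print("estimated delay in samples:", round(tau * fs))
```

In a feature pipeline of the kind the abstract describes, the correlation vector `cc` for each microphone pair (rather than the single delay `tau`) would typically be stacked with the log-Mel spectrogram as extra input channels; how the paper arranges these channels is not stated in the abstract.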

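The combination of a residual connection with channel attention can be sketched as a forward pass in NumPy. The abstract does not say which attention variant is used, so this example assumes a squeeze-and-excitation style gate, one common form of channel attention. To stay short it uses 1x1 channel-mixing layers in place of full convolutions. All names, shapes, and the reduction ratio `r` are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style gate (assumed variant).

    x: (C, F, T) feature map; w1: (C // r, C); w2: (C, C // r).
    """
    squeeze = x.mean(axis=(1, 2))                # global average pool -> (C,)
    gate = sigmoid(w2 @ relu(w1 @ squeeze))      # one weight in (0, 1) per channel
    return x * gate[:, None, None]               # rescale each channel

def residual_attention_block(x, w_a, w_b, w1, w2):
    """Two 1x1 channel-mixing layers, channel attention, then an identity skip."""
    h = relu(np.einsum("oc,cft->oft", w_a, x))
    h = np.einsum("oc,cft->oft", w_b, h)
    h = channel_attention(h, w1, w2)
    return relu(x + h)                           # skip path keeps gradients flowing

# Hypothetical shapes: 8 channels, 64 frequency bins, 100 frames, reduction ratio 4.
rng = np.random.default_rng(0)
C, F, T, r = 8, 64, 100, 4
x = rng.standard_normal((C, F, T))
w_a = 0.1 * rng.standard_normal((C, C))
w_b = 0.1 * rng.standard_normal((C, C))
w1 = 0.1 * rng.standard_normal((C // r, C))
w2 = 0.1 * rng.standard_normal((C, C // r))

y = residual_attention_block(x, w_a, w_b, w1, w2)
print(y.shape)
```

Because the block adds its output to the input, a branch whose weights are zero reduces to `relu(x)`, which is the property that lets such blocks be stacked deeply without degrading the signal.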

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/f065194e4e09/41598_2023_32657_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/a2a294940a76/41598_2023_32657_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/f84d2bf91d48/41598_2023_32657_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/860060ccfcba/41598_2023_32657_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/6c490613ccbc/41598_2023_32657_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/ed4d5c9e589d/41598_2023_32657_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef71/10070247/fe33920622cc/41598_2023_32657_Fig7_HTML.jpg

