An Automatic Classification System for Environmental Sound in Smart Cities.

Authors

Zhang Dongping, Zhong Ziyin, Xia Yuejian, Wang Zhutao, Xiong Wenbo

Affiliations

Key Laboratory of Electromagnetic Wave Information Technology and Metrology of Zhejiang Province, China Jiliang University, Hangzhou 310018, China.

Hangzhou Aihua Intelligent Technology Co., Ltd., 359 Shuxin Road, Hangzhou 311100, China.

Publication information

Sensors (Basel). 2023 Jul 31;23(15):6823. doi: 10.3390/s23156823.

DOI: 10.3390/s23156823
PMID: 37571606
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10422509/
Abstract

As "smart cities" are promoted worldwide, how to combine them with modern technologies (Internet of Things, cloud computing, artificial intelligence) has become a hot topic. However, because environmental sound is non-stationary and urban noise interferes with it, a model with a single input struggles to extract features fully and reach ideal classification results, even with deep learning methods. To improve the recognition accuracy of environmental sound classification (ESC), we propose a dual-branch residual network (dual-resnet) based on feature fusion. For data pre-processing, we propose a loop-padding method that pads shorter clips so that they carry more useful information. To prevent overfitting, we expand the dataset with a time-frequency data augmentation method. After uniform pre-processing of all the original audio, the dual-branch residual network automatically extracts frequency-domain features from the log-Mel spectrogram and the log-spectrogram. The two sets of features are then fused to give a more comprehensive representation of the audio. Experimental results show that, compared with other models, classification accuracy on the UrbanSound8k dataset improves to varying degrees.
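
The abstract names two pre-processing ideas without giving their details: loop-padding of short clips and time-frequency expansion of the dataset. The sketch below is one plausible reading of each, not the authors' implementation. It assumes that loop-padding means repeating a clip cyclically until it reaches a fixed length, and that the time-frequency step resembles masking one random band of frequency rows and one band of time frames in a spectrogram. The function names `loop_pad` and `mask_time_freq` and all parameters are hypothetical.

```python
import random
from typing import List, Optional

Spectrogram = List[List[float]]  # rows = frequency bins, columns = time frames


def loop_pad(samples: List[float], target_len: int) -> List[float]:
    """Repeat a short clip cyclically until it has target_len samples.

    Assumption: this is what "loop-padding" means; the abstract does not say.
    """
    if not samples:
        raise ValueError("cannot loop-pad an empty clip")
    if len(samples) >= target_len:
        return samples[:target_len]  # long clips are truncated to the target
    repeats, remainder = divmod(target_len, len(samples))
    return samples * repeats + samples[:remainder]


def mask_time_freq(
    spec: Spectrogram,
    max_freq_width: int,
    max_time_width: int,
    rng: Optional[random.Random] = None,
) -> Spectrogram:
    """Return a copy of spec with one frequency band and one time band zeroed.

    Assumption: a masking-style augmentation stands in for the paper's
    unspecified time-frequency method.
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # leave the input untouched

    # Pick the width and starting position of each masked band.
    f_width = rng.randint(0, min(max_freq_width, n_freq))
    f_start = rng.randint(0, n_freq - f_width)
    t_width = rng.randint(0, min(max_time_width, n_time))
    t_start = rng.randint(0, n_time - t_width)

    for f in range(n_freq):
        for t in range(n_time):
            in_freq_band = f_start <= f < f_start + f_width
            in_time_band = t_start <= t < t_start + t_width
            if in_freq_band or in_time_band:
                out[f][t] = 0.0
    return out


if __name__ == "__main__":
    print(loop_pad([1.0, 2.0, 3.0], 8))
    toy_spec = [[1.0] * 6 for _ in range(4)]
    for row in mask_time_freq(toy_spec, 2, 3, random.Random(0)):
        print(row)
```

Under this reading, a padded clip contains only real signal instead of trailing silence, which would match the abstract's claim that shorter clips then carry more useful information.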

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/be285f9f645d/sensors-23-06823-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/492159ecf3d2/sensors-23-06823-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/a579b57baab7/sensors-23-06823-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/53851185e4c7/sensors-23-06823-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/80879e43424f/sensors-23-06823-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/a716184f3f57/sensors-23-06823-g006.jpg

Similar articles

1. An Automatic Classification System for Environmental Sound in Smart Cities. Sensors (Basel). 2023 Jul 31;23(15):6823. doi: 10.3390/s23156823.
2. Environment Sound Classification Using a Two-Stream CNN Based on Decision-Level Fusion. Sensors (Basel). 2019 Apr 11;19(7):1733. doi: 10.3390/s19071733.
3. DCNN for Pig Vocalization and Non-Vocalization Classification: Evaluate Model Robustness with New Data. Animals (Basel). 2024 Jul 9;14(14):2029. doi: 10.3390/ani14142029.
4. Evaluating the Performance of Pre-Trained Convolutional Neural Network for Audio Classification on Embedded Systems for Anomaly Detection in Smart Cities. Sensors (Basel). 2023 Jul 7;23(13):6227. doi: 10.3390/s23136227.
5. Environmental sound classification using temporal-frequency attention based convolutional neural network. Sci Rep. 2021 Nov 3;11(1):21552. doi: 10.1038/s41598-021-01045-4.
6. Sound Classification and Processing of Urban Environments: A Systematic Literature Review. Sensors (Basel). 2022 Nov 8;22(22):8608. doi: 10.3390/s22228608.
7. A Music Emotion Classification Model Based on the Improved Convolutional Neural Network. Comput Intell Neurosci. 2022 Feb 14;2022:6749622. doi: 10.1155/2022/6749622. eCollection 2022.
8. Feature-Based Fusion Using CNN for Lung and Heart Sound Classification. Sensors (Basel). 2022 Feb 16;22(4):1521. doi: 10.3390/s22041521.
9. Attention Based Convolutional Neural Network with Multi-frequency Resolution Feature for Environment Sound Classification. Neural Process Lett. 2022 Oct 24:1-16. doi: 10.1007/s11063-022-11041-y.
10. High Accurate Environmental Sound Classification: Sub-Spectrogram Segmentation versus Temporal-Frequency Attention Mechanism. Sensors (Basel). 2021 Aug 16;21(16):5500. doi: 10.3390/s21165500.

Cited by

1. Hierarchical-Concatenate Fusion TDNN for sound event classification. PLoS One. 2024 Oct 31;19(10):e0312998. doi: 10.1371/journal.pone.0312998. eCollection 2024.
