An Automatic Classification System for Environmental Sound in Smart Cities.

Authors

Zhang Dongping, Zhong Ziyin, Xia Yuejian, Wang Zhutao, Xiong Wenbo

Affiliations

Key Laboratory of Electromagnetic Wave Information Technology and Metrology of Zhejiang Province, China Jiliang University, Hangzhou 310018, China.

Hangzhou Aihua Intelligent Technology Co., Ltd., 359 Shuxin Road, Hangzhou 311100, China.

Publication information

Sensors (Basel). 2023 Jul 31;23(15):6823. doi: 10.3390/s23156823.

Abstract

With the continued promotion of "smart cities" worldwide, how to combine smart cities with modern technologies (Internet of Things, cloud computing, artificial intelligence) has become a hot topic. However, because environmental sound is non-stationary and urban noise interferes with it, a model with a single input struggles to extract features fully and achieve ideal classification results, even with deep learning methods. To improve the recognition accuracy of environmental sound classification (ESC), we propose a dual-branch residual network (dual-resnet) based on feature fusion. For data pre-processing, we propose a loop-padding method that pads shorter clips so that they carry more useful information. To prevent overfitting, we use time-frequency data augmentation to expand the dataset. After uniform pre-processing of all the original audio, the dual-branch residual network automatically extracts the frequency-domain features of the log-Mel spectrogram and the log-spectrogram. The two audio features are then fused to make the representation of the audio more comprehensive. Experimental results show that, compared with other models, classification accuracy on the UrbanSound8k dataset improves to varying degrees.
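The abstract names a loop-padding step but gives no details. The sketch below illustrates one plausible reading: a short clip is repeated from its start until it fills a fixed length, instead of being filled with silence. The function names `loop_pad` and `zero_pad`, the toy sample values, and the target length are assumptions made for illustration, not the authors' implementation.

```python
from itertools import cycle, islice


def loop_pad(samples, target_len):
    """Repeat a clip from its start until it has exactly target_len samples.

    Assumed reading of "loop-padding"; clips longer than target_len
    are simply truncated.
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    if not samples:
        # Nothing to repeat, so fall back to silence.
        return [0.0] * target_len
    return list(islice(cycle(samples), target_len))


def zero_pad(samples, target_len):
    """Baseline for comparison: fill the remainder with silence."""
    return (list(samples) + [0.0] * target_len)[:target_len]


clip = [0.1, -0.2, 0.3]          # toy 3-sample "audio clip"
print(loop_pad(clip, 8))         # [0.1, -0.2, 0.3, 0.1, -0.2, 0.3, 0.1, -0.2]
print(zero_pad(clip, 8))         # [0.1, -0.2, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Under this reading, every frame of the padded clip contains real signal, whereas zero-padding leaves the tail of the spectrogram empty; that is one way to interpret the abstract's claim that padded clips carry more useful information.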

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fda0/10422509/be285f9f645d/sensors-23-06823-g001.jpg
