Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks.

Affiliation

Australian Institute of Health Innovation, Macquarie University, Sydney, NSW 2109, Australia.

Publication information

Sensors (Basel). 2021 May 14;21(10):3434. doi: 10.3390/s21103434.

DOI:10.3390/s21103434
PMID:34069189
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8156023/
Abstract

Audio signal classification finds various applications in detecting and monitoring health conditions in healthcare. Convolutional neural networks (CNN) have produced state-of-the-art results in image classification and are being increasingly used in other tasks, including signal classification. However, audio signal classification using CNN presents various challenges. In image classification tasks, raw images of equal dimensions can be used as a direct input to CNN. Raw time-domain signals, on the other hand, can be of varying dimensions. In addition, the temporal signal often has to be transformed to frequency-domain to reveal unique spectral characteristics, therefore requiring signal transformation. In this work, we overview and benchmark various audio signal representation techniques for classification using CNN, including approaches that deal with signals of different lengths and combine multiple representations to improve the classification accuracy. Hence, this work surfaces important empirical evidence that may guide future works deploying CNN for audio signal classification purposes.
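
The two obstacles named in the abstract (signals of varying length, and the need to move from the time domain to the frequency domain) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not specify their representations or parameters, so the helper names `fix_length` and `log_spectrogram`, the 8 kHz sample rate, the 256-sample Hann window, the 128-sample hop, and zero-padding as the length-equalization strategy are all assumptions chosen for illustration.

```python
import numpy as np

def fix_length(signal, target_len):
    """Zero-pad or truncate a 1-D signal to exactly target_len samples (assumed strategy)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.shape[0] >= target_len:
        return signal[:target_len]
    return np.pad(signal, (0, target_len - signal.shape[0]))

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram (dB) from a short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins and columns are time frames.
    return 20.0 * np.log10(magnitude + 1e-10).T

# Two synthetic clips of different durations (illustrative data only).
sample_rate = 8000
t_short = np.arange(int(0.5 * sample_rate)) / sample_rate
t_long = np.arange(int(1.5 * sample_rate)) / sample_rate
short_clip = np.sin(2 * np.pi * 440 * t_short)
long_clip = np.sin(2 * np.pi * 880 * t_long)

# Equalize lengths first, then transform: both outputs share one shape,
# so they can be stacked into a single batch for a CNN.
target_len = sample_rate  # one second
spec_short = log_spectrogram(fix_length(short_clip, target_len))
spec_long = log_spectrogram(fix_length(long_clip, target_len))
batch = np.stack([spec_short, spec_long])[:, np.newaxis, :, :]
print(batch.shape)  # (batch, channel, freq_bins, time_frames)
```

Zero-padding is only one way to equalize input sizes; cropping, resampling, or pooling over the time axis are alternatives, and the abstract does not say which of these the paper benchmarks.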

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de62/8156023/e7f78c8ce64e/sensors-21-03434-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de62/8156023/0f6c382661cf/sensors-21-03434-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de62/8156023/1ff5a5abe949/sensors-21-03434-g002.jpg
