
Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations.

Authors

Doungpaisan Pafan, Khunarsa Peerapol

Affiliations

Faculty of Industrial Technology and Management, King Mongkut's University of Technology North Bangkok, Bangkok 10800, Thailand.

Faculty of Science and Technology, Uttaradit Rajabhat University, Uttaradit 53000, Thailand.

Publication information

J Imaging. 2025 Aug 21;11(8):281. doi: 10.3390/jimaging11080281.

DOI:10.3390/jimaging11080281
PMID:40863491
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12387842/
Abstract

Gunshot sound classification plays a crucial role in public safety, forensic investigations, and intelligent surveillance systems. This study evaluates the performance of deep learning models in classifying firearm sounds by analyzing twelve time-frequency spectrogram representations, including Mel, Bark, MFCC, CQT, Cochleagram, STFT, FFT, Reassigned, Chroma, Spectral Contrast, and Wavelet. The dataset consists of 2148 gunshot recordings from four firearm types, collected in a semi-controlled outdoor environment under multi-orientation conditions. To leverage advanced computer vision techniques, all spectrograms were converted into RGB images using perceptually informed colormaps. This enabled the application of image processing approaches and fine-tuning of pre-trained Convolutional Neural Networks (CNNs) originally developed for natural image classification. Six CNN architectures (ResNet18, ResNet50, ResNet101, GoogLeNet, Inception-v3, and InceptionResNetV2) were trained on these spectrogram images. Experimental results indicate that CQT, Cochleagram, and Mel spectrograms consistently achieved high classification accuracy, exceeding 94% when paired with deep CNNs such as ResNet101 and InceptionResNetV2. These findings demonstrate that transforming time-frequency features into RGB images not only facilitates the use of image-based processing but also allows deep models to capture rich spectral-temporal patterns, providing a robust framework for accurate firearm sound classification.
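
The spectrogram-to-RGB step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the STFT parameters (`n_fft`, `hop`), the 80 dB dynamic range, the synthetic decaying-noise "shot", and the five anchor colors of the dark-to-bright colormap are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def stft_log_magnitude(signal, n_fft=512, hop=128):
    """Return a log-magnitude STFT in dB, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # frequency on axis 0
    return 20.0 * np.log10(magnitude + 1e-10)


def to_rgb_image(spec_db, dynamic_range_db=80.0):
    """Map a dB spectrogram to a uint8 RGB image of shape (freq, time, 3)."""
    top = spec_db.max()
    floor = top - dynamic_range_db
    norm = np.clip((spec_db - floor) / dynamic_range_db, 0.0, 1.0)

    # Hypothetical anchor colors (dark purple -> red -> orange -> pale yellow),
    # standing in for a perceptually informed colormap.
    anchors = np.array(
        [[0, 0, 4], [87, 16, 110], [188, 55, 84], [249, 142, 9], [252, 255, 164]],
        dtype=float,
    )
    positions = np.linspace(0.0, 1.0, len(anchors))
    channels = [
        np.interp(norm.ravel(), positions, anchors[:, c]).reshape(norm.shape)
        for c in range(3)
    ]
    rgb = np.stack(channels, axis=-1)
    return np.flipud(rgb).astype(np.uint8)  # put low frequencies at the bottom


# Synthetic stand-in for a recording: half a second of decaying noise.
sample_rate = 44100  # assumed sampling rate
t = np.arange(sample_rate // 2) / sample_rate
rng = np.random.default_rng(0)
signal = rng.standard_normal(t.size) * np.exp(-40.0 * t)

image = to_rgb_image(stft_log_magnitude(signal))
```

An array like `image` would typically be resized to the input resolution a pre-trained CNN expects (commonly 224x224 for ResNet-style models) before fine-tuning. Other representations named in the abstract, such as Mel or CQT, would replace only the `stft_log_magnitude` step.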


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7134/12387842/dafcafc6d0e6/jimaging-11-00281-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7134/12387842/560b70b0e413/jimaging-11-00281-g001.jpg

