Rigdelet neural network and improved partial reinforcement effect optimizer for music genre classification from sound spectrum images.

Authors

Wang Fei, Fu Shuai, Abza Francis

Affiliations

School of educational science, Jilin Normal College of Engineering Technology, Jilin, 130052, Jilin, China.

Changchun Humanities and Sciences College, ChangChun, 130117, JiLin, China.

Publication information

Heliyon. 2024 Jul 4;10(14):e34067. doi: 10.1016/j.heliyon.2024.e34067. eCollection 2024 Jul 30.

DOI:10.1016/j.heliyon.2024.e34067
PMID:39104510
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11298872/
Abstract

This paper introduces a new approach to music genre classification. The approach transforms an audio signal into a unified representation, the sound spectrum, from which texture features are extracted using an enhanced Rigdelet Neural Network (RNN). The RNN is optimized with an improved version of the partial reinforcement effect optimizer (IPREO), which effectively avoids local optima and enhances the RNN's generalization capability. Experiments on the GTZAN dataset assess the effectiveness of the proposed RNN/IPREO model for music genre classification. Using a combination of spectral centroid, Mel-spectrogram, and Mel-frequency cepstral coefficients (MFCCs) as features, the model reaches an accuracy of 92%. This significantly outperforms K-Means (58%) and Support Vector Machines (up to 68%). The RNN/IPREO model also outperforms several neural architectures: Neural Networks (65%), RNNs (84%), CNNs (88%), DNNs (86%), VGG-16 (91%), and ResNet-50 (90%). Notably, the RNN/IPREO model achieves results comparable to well-known deep models such as VGG-16, ResNet-50, and RNN-LSTM, and sometimes surpasses them. This highlights the strength of its hybrid CNN-Bi-directional RNN design, combined with the IPREO parameter optimization algorithm, for extracting intricate and sequential auditory data.
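
The abstract names the spectral centroid as one of the input features but does not give the extraction procedure. As a rough illustration only, the sketch below computes the textbook definition of the spectral centroid (the magnitude-weighted mean frequency of one frame) using a naive DFT from the Python standard library. The function names, frame length, sample rate, and test tone are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (non-negative frequencies)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of a single frame."""
    n = len(frame)
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    freqs = [k * sample_rate / n for k in range(len(mags))]
    return sum(f * m for f, m in zip(freqs, mags)) / total


# Hypothetical test signal: a 1000 Hz sine that falls exactly on DFT bin 32.
SAMPLE_RATE = 8000
FRAME_LEN = 256
TONE_HZ = 1000.0
tone = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
centroid_hz = spectral_centroid(tone, SAMPLE_RATE)
print(f"spectral centroid: {centroid_hz:.1f} Hz")
```

For a pure tone aligned to a DFT bin, the centroid should land very close to the tone frequency. Real music frames would normally be windowed and processed with an FFT library rather than this quadratic-time loop.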

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/bca7138305b9/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/38f2c190e20e/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/8e4eb69fe0c2/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/e818b3ed8c7d/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/26ba91abc842/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/793d055ca98e/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00e9/11298872/d771fce95d8d/gr7.jpg

Similar articles

1
Rigdelet neural network and improved partial reinforcement effect optimizer for music genre classification from sound spectrum images.
Heliyon. 2024 Jul 4;10(14):e34067. doi: 10.1016/j.heliyon.2024.e34067. eCollection 2024 Jul 30.
2
Optimizing the configuration of deep learning models for music genre classification.
Heliyon. 2024 Jan 17;10(2):e24892. doi: 10.1016/j.heliyon.2024.e24892. eCollection 2024 Jan 30.
3
A Multimodal Convolutional Neural Network Model for the Analysis of Music Genre on Children's Emotions Influence Intelligence.
Comput Intell Neurosci. 2022 Aug 29;2022:5611456. doi: 10.1155/2022/5611456. eCollection 2022.
4
The Effect of Signal Duration on the Classification of Heart Sounds: A Deep Learning Approach.
Sensors (Basel). 2022 Mar 15;22(6):2261. doi: 10.3390/s22062261.
5
COVID-19 Detection using Hybrid CNN-RNN Architecture with Transfer Learning from X-Rays.
Curr Med Imaging. 2023 Aug 17. doi: 10.2174/1573405620666230817092337.
6
Deep learning in automatic detection of dysphonia: Comparing acoustic features and developing a generalizable framework.
Int J Lang Commun Disord. 2023 Mar;58(2):279-294. doi: 10.1111/1460-6984.12783. Epub 2022 Sep 18.
7
Comparative Study of Popular Deep Learning Models for Machining Roughness Classification Using Sound and Force Signals.
Micromachines (Basel). 2021 Nov 29;12(12):1484. doi: 10.3390/mi12121484.
8
Classification of benign and malignant subtypes of breast cancer histopathology imaging using hybrid CNN-LSTM based transfer learning.
BMC Med Imaging. 2023 Jan 30;23(1):19. doi: 10.1186/s12880-023-00964-0.
9
Automated AJCC (7th edition) staging of non-small cell lung cancer (NSCLC) using deep convolutional neural network (CNN) and recurrent neural network (RNN).
Health Inf Sci Syst. 2019 Jul 30;7(1):14. doi: 10.1007/s13755-019-0077-1. eCollection 2019 Dec.
10
Music Similarity Detection Guided by Deep Learning Model.
Comput Intell Neurosci. 2023 Feb 20;2023:1263620. doi: 10.1155/2023/1263620. eCollection 2023.

Cited by

1
An improved ViT model for music genre classification based on mel spectrogram.
PLoS One. 2025 Mar 13;20(3):e0319027. doi: 10.1371/journal.pone.0319027. eCollection 2025.
