Musical Emotion Recognition with Spectral Feature Extraction based on a Sinusoidal Model with Model-based and Deep-learning approaches.

Authors

Xie Baijun, Kim Jonathan C, Park Chung Hyuk

Affiliation

Department of Biomedical Engineering, The George Washington University, Washington, DC 20052, USA.

Publication information

Appl Sci (Basel). 2020 Feb;10(3). doi: 10.3390/app10030902. Epub 2020 Jan 30.

DOI:10.3390/app10030902

PMID:35582331

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9109831/

Abstract

This paper presents a method for extracting novel spectral features based on a sinusoidal model. The method is focused on characterizing the spectral shapes of audio signals using spectral peaks in frequency sub-bands. The extracted features are evaluated for predicting the levels of emotional dimensions, namely arousal and valence. Principal component regression, partial least squares regression, and deep convolutional neural network (CNN) models are used as prediction models for the levels of the emotional dimensions. The experimental results indicate that the proposed features include additional spectral information that common baseline features may not include. Since the quality of audio signals, especially timbre, plays a major role in affecting the perception of emotional valence in music, the inclusion of the presented features will contribute to decreasing the prediction error rate.
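
The abstract does not give the details of the feature extraction, so the following is only a minimal sketch of the general idea of describing spectral shape by peaks within frequency sub-bands. It is not the authors' algorithm. The equal-width band layout, the noise floor `rel_floor`, the naive DFT, and the function names `magnitude_spectrum` and `subband_peak_features` are all assumptions made for illustration.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (adequate for short frames)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags


def subband_peak_features(signal, sample_rate, n_bands=4, rel_floor=1e-6):
    """Return one (peak_frequency_hz, peak_magnitude) pair per sub-band.

    Hypothetical layout: equal-width bands over 0..Nyquist, the strongest
    local maximum in each band, and (0.0, 0.0) when a band has no local
    maximum above rel_floor * (largest magnitude in the frame).
    """
    mags = magnitude_spectrum(signal)
    n = len(signal)
    n_bins = len(mags)
    floor = rel_floor * max(mags)
    band_width = n_bins / n_bands
    features = []
    for b in range(n_bands):
        lo = max(1, int(round(b * band_width)))
        hi = min(n_bins - 1, int(round((b + 1) * band_width)))
        best = (0.0, 0.0)
        for k in range(lo, hi):
            # A bin is a peak if it rises above both neighbours and the floor.
            is_peak = (mags[k] > floor
                       and mags[k] > mags[k - 1]
                       and mags[k] >= mags[k + 1])
            if is_peak and mags[k] > best[1]:
                best = (k * sample_rate / n, mags[k])
        features.append(best)
    return features


# Two-tone test frame: 1 kHz (strong) and 3 kHz (weaker), sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate)
         + 0.5 * math.sin(2 * math.pi * 3000 * t / sample_rate)
         for t in range(256)]
features = subband_peak_features(frame, sample_rate, n_bands=4)
```

For this synthetic frame, the 1 kHz tone is reported as the peak of the second band and the 3 kHz tone as the peak of the third, while the first and fourth bands contain no peak above the floor. Per-band pairs like these could then be concatenated into a feature vector for a regression model such as those named in the abstract.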
