Musical Emotion Recognition with Spectral Feature Extraction Based on a Sinusoidal Model with Model-Based and Deep-Learning Approaches

Author Information

Xie Baijun, Kim Jonathan C, Park Chung Hyuk

Affiliation

Department of Biomedical Engineering, The George Washington University, Washington, DC 20052, USA.

Publication Information

Appl Sci (Basel). 2020 Feb;10(3). doi: 10.3390/app10030902. Epub 2020 Jan 30.

Abstract

This paper presents a method for extracting novel spectral features based on a sinusoidal model. The method is focused on characterizing the spectral shapes of audio signals using spectral peaks in frequency sub-bands. The extracted features are evaluated for predicting the levels of emotional dimensions, namely arousal and valence. Principal component regression, partial least squares regression, and deep convolutional neural network (CNN) models are used as prediction models for the levels of the emotional dimensions. The experimental results indicate that the proposed features include additional spectral information that common baseline features may not include. Since the quality of audio signals, especially timbre, plays a major role in affecting the perception of emotional valence in music, the inclusion of the presented features will contribute to decreasing the prediction error rate.
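The abstract describes the pipeline only at a high level. The sketch below is not the authors' implementation; it only illustrates the general idea of summarizing a frame's spectral shape by the strongest spectral peak in each frequency sub-band and regressing arousal/valence with partial least squares. The sub-band edges, frame length, window, number of PLS components, and the synthetic stand-in data are all illustrative assumptions.

import numpy as np
from scipy.signal import find_peaks
from sklearn.cross_decomposition import PLSRegression

def subband_peak_features(frame, sr, band_edges_hz):
    # Magnitude spectrum of a Hann-windowed frame.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    feats = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band = spectrum[(freqs >= lo) & (freqs < hi)]
        peaks, _ = find_peaks(band)
        # Strongest local peak in the sub-band; fall back to the band maximum.
        peak_mag = band[peaks].max() if peaks.size else band.max()
        feats.append(20.0 * np.log10(peak_mag + 1e-12))
    return np.array(feats)

# Synthetic stand-in data: in practice X would be computed from annotated
# music excerpts and Y taken from their arousal/valence ratings.
sr = 22050
band_edges = [0, 200, 400, 800, 1600, 3200, 6400, 11025]
rng = np.random.default_rng(0)
X = np.stack([subband_peak_features(rng.standard_normal(2048), sr, band_edges)
              for _ in range(100)])
Y = rng.uniform(-1.0, 1.0, size=(100, 2))   # (arousal, valence) targets

pls = PLSRegression(n_components=4)
pls.fit(X, Y)
predictions = pls.predict(X[:5])            # predicted (arousal, valence) pairs

Partial least squares is shown here only because it is one of the three predictors named in the abstract; the same feature matrix could equally be fed to principal component regression or a CNN operating on spectro-temporal inputs.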
