Predicting the perception of performed dynamics in music audio with ensemble learning.

Authors

Elowsson Anders, Friberg Anders

Affiliation

KTH Royal Institute of Technology, School of Computer Science and Communication, Speech, Music and Hearing, Stockholm, Sweden.

Publication

J Acoust Soc Am. 2017 Mar;141(3):2224. doi: 10.1121/1.4978245.

Abstract

By varying the dynamics in a musical performance, the musician can convey structure and different expressions. Spectral properties of most musical instruments change in a complex way with the performed dynamics, but dedicated audio features for modeling the parameter are lacking. In this study, feature extraction methods were developed to capture relevant attributes related to spectral characteristics and spectral fluctuations, the latter through a sectional spectral flux. Previously, ground-truth ratings of performed dynamics had been collected by asking listeners to rate how soft or loud the musicians played in a set of audio files. The ratings, averaged over subjects, were used to train three different machine learning models, using the audio features developed for the study as input. The best result, an R of 0.84, was produced by an ensemble of multilayer perceptrons. This result seems to be close to the upper bound, given the estimated uncertainty of the ground-truth data. The result is well above that of individual human listeners in the previous listening experiment, and on par with the performance achieved by the average rating of six listeners. Features were analyzed with a factorial design, which highlighted the importance of source separation in the feature extraction.
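
The abstract names spectral flux as the basis for its fluctuation features but does not describe the "sectional" variant. As a rough illustration only, the sketch below computes the plain, textbook form of spectral flux (the half-wave rectified frame-to-frame increase in magnitude, summed over frequency bins) with NumPy. The function names, frame length, hop size, and the synthetic test tone are all assumptions made for this example and are not taken from the study.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=1024, hop=512):
    """Short-time magnitude spectrum, one row per Hann-windowed frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def spectral_flux(mag):
    """Plain spectral flux: positive magnitude changes between
    consecutive frames, summed over all frequency bins."""
    diff = np.diff(mag, axis=0)
    return np.maximum(diff, 0.0).sum(axis=1)


# Hypothetical test signal: a 440 Hz tone that gets four times
# louder halfway through (not data from the study).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
tone[sr // 2:] *= 4.0

flux = spectral_flux(magnitude_spectrogram(tone))
onset_frame = int(np.argmax(flux))
print("largest flux at frame transition", onset_frame)
```

On this signal the flux stays near zero while the tone is steady and peaks at the frames that straddle the jump in level. A per-section or per-band variant would presumably apply the same difference to subsets of the spectrogram, but how the study partitions it is not stated in the abstract.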