Siedenburg Kai, Müllensiefen Daniel
Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany.
Department of Psychology, Goldsmiths, University of London, London, UK.
Front Psychol. 2017 Apr 26;8:639. doi: 10.3389/fpsyg.2017.00639. eCollection 2017.
There is evidence from a number of recent studies that most listeners can extract information related to song identity, emotion, or genre from music excerpts with durations in the range of tenths of seconds. Because of these very short durations, timbre, as a multifaceted auditory attribute, appears to be a plausible candidate for the type of feature that listeners make use of when processing short music excerpts. However, the importance of timbre in listening tasks that involve short excerpts has not yet been demonstrated empirically. Hence, the goal of this study was to develop a method that allows us to explore to what degree similarity judgments of short music clips can be modeled with low-level acoustic features related to timbre. We utilized similarity data from two large samples of participants: Sample I was obtained via an online survey, used 16 clips of 400 ms length, and contained responses from 137,339 participants; Sample II was collected in a lab environment, used 16 clips of 800 ms length, and contained responses from 648 participants. Our model used two sets of audio features, comprising commonly used timbre descriptors and the well-known Mel-frequency cepstral coefficients together with their temporal derivatives. To predict pairwise similarities, the resulting distances between clips in terms of their audio features were used as predictor variables in a partial least-squares regression. We found that a sparse selection of three to seven features from both descriptor sets, mainly encoding the coarse shape of the spectrum as well as spectrotemporal variability, best predicted similarities across the two sets of sounds. Notably, the inclusion of non-acoustic predictors of musical genre and record release date yielded much better generalization performance and explained up to 50% of the shared variance between observations and model predictions.
Overall, the results of this study empirically demonstrate that both acoustic features related to timbre and higher-level categorical features such as musical genre play a major role in the perception of short music clips.
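The modeling approach described in the abstract (per-feature distances between clips used as predictor variables in a partial least-squares regression) can be sketched in plain numpy. The minimal NIPALS-style PLS1 routine below, the 16 hypothetical clip feature vectors, and the synthetic similarity judgments are all illustrative assumptions, not the paper's actual feature sets, data, or software.

```python
import numpy as np

def pairwise_distances(features):
    """Euclidean distances between all clip pairs (upper triangle order)."""
    n = len(features)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return np.array([np.linalg.norm(features[i] - features[j]) for i, j in pairs])

def pls1_fit(X, y, n_components=3):
    """Minimal NIPALS PLS1 regression; returns (intercept, coefficient vector)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight vector from X-y covariance
        w /= np.linalg.norm(w)
        t = Xk @ w                    # latent score for this component
        tt = t @ t
        p = Xk.T @ t / tt             # X loading
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.inv(P.T @ W) @ q
    return y_mean - x_mean @ B, B

# Toy demonstration: 16 clips, each with 5 hypothetical timbre-like features,
# giving 16 * 15 / 2 = 120 clip pairs, as in the study's 16-clip design.
rng = np.random.default_rng(0)
clips = rng.normal(size=(16, 5))
dist_per_feature = np.stack(
    [pairwise_distances(clips[:, [k]]) for k in range(5)], axis=1
)  # one distance column per feature: shape (120, 5)
true_w = np.array([1.0, 0.5, 0.0, 0.0, 0.2])   # assumed "ground truth" weights
similarity = -dist_per_feature @ true_w         # synthetic similarity judgments
b0, B = pls1_fit(dist_per_feature, similarity, n_components=3)
pred = b0 + dist_per_feature @ B
r2 = 1 - np.sum((similarity - pred) ** 2) / np.sum((similarity - similarity.mean()) ** 2)
```

A sparse PLS model with few components, as here, mirrors the study's finding that a small number of distance features suffices; real use would substitute measured timbre descriptors (e.g., MFCCs and their derivatives) for the random toy features.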