Modeling Timbre Similarity of Short Music Clips.

Authors

Siedenburg Kai, Müllensiefen Daniel

Affiliations

Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany.

Department of Psychology, Goldsmiths, University of London, London, UK.

Publication information

Front Psychol. 2017 Apr 26;8:639. doi: 10.3389/fpsyg.2017.00639. eCollection 2017.

DOI:10.3389/fpsyg.2017.00639
PMID:28491045
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5405345/

Abstract

There is evidence from a number of recent studies that most listeners are able to extract information related to song identity, emotion, or genre from music excerpts with durations in the range of tenths of seconds. Because of these very short durations, timbre, as a multifaceted auditory attribute, appears to be a plausible candidate for the type of features that listeners make use of when processing short music excerpts. However, the importance of timbre in listening tasks that involve short excerpts has not yet been demonstrated empirically. Hence, the goal of this study was to develop a method for exploring to what degree similarity judgments of short music clips can be modeled with low-level acoustic features related to timbre. We utilized the similarity data from two large samples of participants: Sample I was obtained via an online survey, used 16 clips of 400 ms length, and contained responses from 137,339 participants. Sample II was collected in a lab environment, used 16 clips of 800 ms length, and contained responses from 648 participants. Our model used two sets of audio features, which included commonly used timbre descriptors and the well-known Mel-frequency cepstral coefficients as well as their temporal derivatives. In order to predict pairwise similarities, the resulting distances between clips in terms of their audio features were used as predictor variables with partial least-squares regression. We found that a sparse selection of three to seven features from both descriptor sets, mainly encoding the coarse shape of the spectrum as well as spectrotemporal variability, best predicted similarities across the two sets of sounds. Notably, the inclusion of non-acoustic predictors of musical genre and record release date allowed much better generalization performance and explained up to 50% of shared variance (R²) between observations and model predictions. Overall, the results of this study empirically demonstrate that both acoustic features related to timbre and higher-level categorical features such as musical genre play a major role in the perception of short music clips.
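
The modeling step described in the abstract (per-feature distances between clips used as predictors of pairwise similarity in a partial least-squares regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy feature values and ratings are invented, the per-feature distance is taken to be an absolute difference, and the regression is a single-component PLS1 written out by hand, which is not necessarily the configuration the authors used. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_feature_distances(features):
    """Return (pairs, X): X[k][j] is |f_a[j] - f_b[j]| for the k-th clip pair.

    Assumption: absolute difference per feature stands in for the
    feature-wise distance between two clips.
    """
    pairs = list(combinations(range(len(features)), 2))
    X = [[abs(u - v) for u, v in zip(features[a], features[b])]
         for a, b in pairs]
    return pairs, X


def fit_pls1_one_component(X, y):
    """Fit a single-component PLS1 model; returns (x_mean, y_mean, coef)."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in predictor space of maximal covariance with y.
    w = [sum(row[j] * yv for row, yv in zip(Xc, yc))
         for j in range(len(x_mean))]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered observation onto w.
    t = [sum(v * wj for v, wj in zip(row, w)) for row in Xc]
    # Regress y on the scores, then map back to predictor coefficients.
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [b * wj for wj in w]
    return x_mean, y_mean, coef


def predict(model, X):
    """Predict responses for rows of X from a fitted model."""
    x_mean, y_mean, coef = model
    return [y_mean + sum((v - m) * c for v, m, c in zip(row, x_mean, coef))
            for row in X]


def squared_correlation(y, y_hat):
    """Squared Pearson correlation: shared variance between y and y_hat."""
    n = len(y)
    my, mh = sum(y) / n, sum(y_hat) / n
    cov = sum((a - my) * (b - mh) for a, b in zip(y, y_hat))
    var_y = sum((a - my) ** 2 for a in y)
    var_h = sum((b - mh) ** 2 for b in y_hat)
    return cov * cov / (var_y * var_h)


if __name__ == "__main__":
    # Invented toy data: 4 clips x 3 audio features (not from the study).
    clip_features = [
        [0.10, 1.20, 0.30],
        [0.40, 0.90, 0.35],
        [0.90, 0.20, 0.80],
        [0.70, 0.50, 0.60],
    ]
    pairs, X = pairwise_feature_distances(clip_features)
    # Invented mean dissimilarity ratings, one per clip pair.
    ratings = [0.2, 0.9, 0.6, 0.7, 0.4, 0.3]
    model = fit_pls1_one_component(X, ratings)
    fitted = predict(model, X)
    print("pairs:", pairs)
    print("in-sample shared variance:", squared_correlation(ratings, fitted))
```

The shared variance printed here is computed in-sample on the toy data. The abstract's figures concern generalization across two sets of sounds, which would require fitting on one set of clip pairs and evaluating on another.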

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8df1/5405345/d2d6fe004a01/fpsyg-08-00639-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8df1/5405345/6ded6e58eba5/fpsyg-08-00639-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8df1/5405345/30f529d4ed32/fpsyg-08-00639-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8df1/5405345/1be33f7ad34c/fpsyg-08-00639-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8df1/5405345/e36cb2d6a7cf/fpsyg-08-00639-g0005.jpg
