
A transformers-based approach for fine and coarse-grained classification and generation of MIDI songs and soundtracks.

Authors

Angioni Simone, Lincoln-DeCusatis Nathan, Ibba Andrea, Reforgiato Recupero Diego

Affiliations

Department of Mathematics and Computer Science, University of Cagliari, Cagliari, Sardegna, Italy.

Department of Music, Fordham University, New York, United States of America.

Publication information

PeerJ Comput Sci. 2023 Jun 19;9:e1410. doi: 10.7717/peerj-cs.1410. eCollection 2023.

DOI:10.7717/peerj-cs.1410
PMID:37409082
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10319258/
Abstract

Music is an extremely subjective art form whose commodification by the recording industry in the 20th century has led to an increasingly subdivided set of genre labels that attempt to organize musical styles into definite categories. Music psychology has long studied the processes through which music is perceived, created, responded to, and incorporated into everyday life, and modern artificial intelligence technology can be exploited in this direction. Music classification and generation are emerging fields that have recently gained much attention, especially with the latest advances in deep learning. Self-attention networks have brought substantial benefits to classification and generation tasks across domains that use different types of data (text, images, videos, sounds). In this article, we analyze the effectiveness of Transformers for both classification and generation, studying classification performance at different granularities and generation performance using different human and automatic metrics. The input data consist of MIDI sounds drawn from different datasets: sounds from 397 Nintendo Entertainment System (NES) video games, classical pieces, and rock songs from different composers and bands. We performed classification within each dataset to identify the type or composer of each sample (fine-grained), and classification at a higher level, in which we combined the three datasets with the goal of labeling each sample as just NES, rock, or classical (coarse-grained). The proposed transformers-based approach outperformed competitors based on deep learning and machine learning approaches. Finally, the generation task was carried out on each dataset, and the resulting samples were evaluated using human and automatic metrics (the local alignment).
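
The abstract names local alignment as the automatic metric for generated samples but does not give the scoring scheme. As a rough illustration only, the sketch below computes a Smith-Waterman-style local alignment score between two sequences of MIDI pitch numbers. The function name `local_alignment_score`, the match/mismatch/gap values, and the pitch-only note representation are all assumptions made for this example, not details taken from the article.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences (Smith-Waterman style).

    The scoring values are illustrative assumptions, not the paper's settings.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for x in a:
        curr = [0]  # column 0 is always 0
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A cell never drops below 0, so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical pitch sequences: a generated phrase and a training phrase.
generated = [60, 62, 64, 65, 67, 72]
reference = [55, 60, 62, 64, 65, 67, 69]
print(local_alignment_score(generated, reference))
```

With these scoring values, a higher score means the two sequences share a longer closely matching passage, which is one way to gauge how much a generated sample borrows from a reference piece.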

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/aa829a6a75a5/peerj-cs-09-1410-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/a3b89daac7ad/peerj-cs-09-1410-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/c347a2c887a8/peerj-cs-09-1410-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/b2d416a01594/peerj-cs-09-1410-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/ba38bdef485b/peerj-cs-09-1410-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/ec64264ae620/peerj-cs-09-1410-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/e70460b9edae/peerj-cs-09-1410-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/961a95a820cf/peerj-cs-09-1410-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/08baf68828c0/peerj-cs-09-1410-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a16/10319258/9ab20526e9b3/peerj-cs-09-1410-g009.jpg

Similar articles

1. A transformers-based approach for fine and coarse-grained classification and generation of MIDI songs and soundtracks.
   PeerJ Comput Sci. 2023 Jun 19;9:e1410. doi: 10.7717/peerj-cs.1410. eCollection 2023.
2. A Lightweight Deep Learning-Based Approach for Jazz Music Generation in MIDI Format.
   Comput Intell Neurosci. 2022 Aug 5;2022:2140895. doi: 10.1155/2022/2140895. eCollection 2022.
3. The Classification of Music and Art Genres under the Visual Threshold of Deep Learning.
   Comput Intell Neurosci. 2022 May 18;2022:4439738. doi: 10.1155/2022/4439738. eCollection 2022.
4. Creating musical features using multi-faceted, multi-task encoders based on transformers.
   Sci Rep. 2023 Jul 3;13(1):10713. doi: 10.1038/s41598-023-36714-z.
5. NLP-based music processing for composer classification.
   Sci Rep. 2023 Aug 14;13(1):13228. doi: 10.1038/s41598-023-40332-0.
6. Transformers-sklearn: a toolkit for medical language understanding with transformer-based models.
   BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):90. doi: 10.1186/s12911-021-01459-0.
7. Classification of Adventitious Sounds Combining Cochleogram and Vision Transformers.
   Sensors (Basel). 2024 Jan 21;24(2):682. doi: 10.3390/s24020682.
8. Fine-Grained Video Captioning via Graph-based Multi-Granularity Interaction Learning.
   IEEE Trans Pattern Anal Mach Intell. 2022 Feb;44(2):666-683. doi: 10.1109/TPAMI.2019.2946823. Epub 2022 Jan 7.
9. SunoCaps: A novel dataset of text-prompt based AI-generated music with emotion annotations.
   Data Brief. 2024 Jul 18;55:110743. doi: 10.1016/j.dib.2024.110743. eCollection 2024 Aug.
10. KFWC: A Knowledge-Driven Deep Learning Model for Fine-grained Classification of Wet-AMD.
    Comput Methods Programs Biomed. 2023 Feb;229:107312. doi: 10.1016/j.cmpb.2022.107312. Epub 2022 Dec 15.
