Roa Dabike Gerardo, Cox Trevor J, Miller Alex J, Fazenda Bruno M, Graetzer Simone, Vos Rebecca R, Akeroyd Michael A, Firth Jennifer, Whitmer William M, Bannister Scott, Greasley Alinka, Barker Jon P
Acoustics Research Centre, University of Salford, UK.
Hearing Sciences, Mental Health and Clinical Neurosciences, School of Medicine, University of Nottingham, UK.
Data Brief. 2024 Dec 4;57:111199. doi: 10.1016/j.dib.2024.111199. eCollection 2024 Dec.
This paper presents the Cadenza Woodwind Dataset. This publicly available dataset contains synthesised audio for woodwind quartets, including renderings of each instrument in isolation. It was created as training data for Cadenza's second open machine learning challenge (CAD2), for the task of rebalancing classical music ensembles. The dataset is also intended for developing other music information retrieval (MIR) algorithms using machine learning. It was created because of the lack of large-scale datasets of classical woodwind music that have separate audio for each instrument and a permissive licence for reuse. Music scores were selected from the OpenScore String Quartet corpus. These were rendered for two woodwind ensembles: (i) flute, oboe, clarinet and bassoon; and (ii) flute, oboe, alto saxophone and bassoon. The rendering was done by a professional music producer using industry-standard software. Virtual instruments were used to create the audio for each instrument, using software that interpreted expression markings in the score. Convolution reverberation was used to simulate a performance space, and the ensembles were then mixed. The dataset consists of the audio and associated metadata.
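As a rough illustration of the signal processing named in the abstract, the sketch below applies convolution reverberation to isolated instrument stems, sums them into an ensemble mix, and then rebalances the mix with per-instrument gains. It is not the authors' production pipeline, which used professional software and virtual instruments. The sample rate, the sine-tone "stems", the synthetic impulse response, and the helper names `convolve_reverb` and `mix_stems` are all assumptions made for this example.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz; assumed for this toy example only


def convolve_reverb(dry, impulse_response):
    """Convolve a dry signal with a room impulse response (convolution reverb)."""
    return np.convolve(dry, impulse_response, mode="full")


def mix_stems(stems, gains_db=None):
    """Sum equal-length stems after applying an optional per-instrument gain in dB."""
    gains_db = gains_db or {}
    mix = np.zeros(len(next(iter(stems.values()))))
    for name, audio in stems.items():
        linear_gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)
        mix += linear_gain * audio
    return mix


# Toy stand-ins for the isolated instrument renderings: one second of a sine
# tone per instrument (hypothetical pitches, not taken from the dataset).
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
pitches_hz = {"flute": 880.0, "oboe": 660.0, "clarinet": 440.0, "bassoon": 110.0}
dry_stems = {name: 0.1 * np.sin(2 * np.pi * f * t) for name, f in pitches_hz.items()}

# Toy impulse response: exponentially decaying noise standing in for a
# measured impulse response of a performance space.
rng = np.random.default_rng(0)
ir_len = SAMPLE_RATE // 4
toy_ir = rng.standard_normal(ir_len) * np.exp(-np.arange(ir_len) / (0.05 * SAMPLE_RATE))
toy_ir /= np.max(np.abs(toy_ir))

# Reverberate each stem, then build the ensemble mix and a rebalanced version.
wet_stems = {name: convolve_reverb(x, toy_ir) for name, x in dry_stems.items()}
original_mix = mix_stems(wet_stems)
rebalanced_mix = mix_stems(wet_stems, {"flute": 6.0, "bassoon": -6.0})
```

Because convolution is linear, reverberating each stem and then summing gives the same result as summing first and reverberating the mix. Keeping the reverberated stems separate is what allows a remix with different per-instrument gains.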