IEEE Trans Cybern. 2022 May;52(5):3684-3695. doi: 10.1109/TCYB.2020.3014207. Epub 2022 May 19.
Music information retrieval is of great interest in audio signal processing, but relatively little attention has been paid to the playing techniques of musical instruments. This work proposes an automatic system for classifying guitar playing techniques (GPTs). Automatic GPT classification is challenging because some playing techniques differ only slightly from others. The proposed framework uses a new feature extraction method based on spectral-temporal receptive fields (STRFs) to extract features from guitar sounds and applies a supervised deep learning approach to classify GPTs. Specifically, a new deep learning model, called the hierarchical cascade deep belief network (HCDBN), is proposed to perform automatic GPT classification. Several simulations were performed, and three datasets were adopted to compare performance: 1) data on the onsets of signals; 2) complete audio signals; and 3) audio signals in a real-world environment. The proposed system improves the F-score by approximately 11.47% in setup 1) and yields an F-score of 96.82% in setup 2). The results in setup 3) demonstrate that the proposed system also works well in a real-world environment. These results show that the proposed system is robust and highly accurate in automatic GPT classification.
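The F-score figures above can be made concrete with a minimal sketch of how such a score is commonly computed for a multi-class classifier. The abstract does not state which averaging scheme was used, so the macro average below is an assumption, and the technique labels ("bend", "slide", "vibrato") and function names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score.

    F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging, not from the paper)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical labels for six clips; not data from the paper.
    y_true = ["bend", "bend", "slide", "slide", "vibrato", "vibrato"]
    y_pred = ["bend", "slide", "slide", "slide", "vibrato", "bend"]
    print(per_class_f1(y_true, y_pred))
    print(round(macro_f1(y_true, y_pred), 4))
```

Macro averaging weights every technique equally, which matters when some techniques are rarer than others; a micro or weighted average would give different numbers on imbalanced data.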