Dalmazzo David, Waddell George, Ramírez Rafael
Music Technology Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain.
Centre for Performance Science, Royal College of Music, London, United Kingdom.
Front Psychol. 2021 Jan 5;11:575971. doi: 10.3389/fpsyg.2020.575971. eCollection 2020.
Repetitive practice is one of the most important factors in improving the performance of motor skills. This paper focuses on the analysis and classification of forearm gestures in the context of violin playing. We recorded five experts and three students performing eight traditional classical violin bow-strokes: , and . To record inertial motion information, we utilized the sensor, which reports a multidimensional time-series signal. We synchronized the inertial motion recordings with audio data to extract the spatiotemporal dynamics of each gesture. We then implemented and compared several state-of-the-art deep neural network architectures: convolutional neural network (CNN) models reached recognition rates of 97.147%, 3DMultiHeaded_CNN models 98.553%, and CNN_LSTM models 99.234%. The collected data (quaternions describing the violinist's bowing arm) contained sufficient information to distinguish the bowing techniques studied, and deep learning methods were capable of learning the movement patterns that distinguish these techniques. Each of the learning algorithms investigated (CNN, 3DMultiHeaded_CNN, and CNN_LSTM) produced high classification accuracy, which supports the feasibility of training such classifiers. The resulting classifiers may provide the foundation for a digital assistant that enhances the time musicians spend practicing alone by giving real-time feedback on the accuracy and consistency of their musical gestures in performance.
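As an illustration of the synchronization step described above, the following minimal sketch cuts a quaternion stream into labeled gesture clips using time boundaries taken from an audio track, then splits each clip into fixed-length windows of the kind a CNN or CNN_LSTM classifier typically consumes. It is not the authors' pipeline: the function names, the 50 Hz sample rate, the window sizes, and the labels `stroke_a` and `stroke_b` are all hypothetical placeholders.

```python
from typing import List, Sequence, Tuple

Quaternion = Tuple[float, float, float, float]   # (w, x, y, z)
Segment = Tuple[float, float, str]               # (start_s, end_s, label)


def slice_gestures(
    samples: Sequence[Quaternion],
    sample_rate_hz: float,
    segments: Sequence[Segment],
) -> List[Tuple[str, List[Quaternion]]]:
    """Cut the inertial stream into one clip per audio-derived segment."""
    clips = []
    for start_s, end_s, label in segments:
        # Convert times in seconds to sample indices, clamped to the stream.
        start = max(0, round(start_s * sample_rate_hz))
        end = min(len(samples), round(end_s * sample_rate_hz))
        if end > start:
            clips.append((label, list(samples[start:end])))
    return clips


def fixed_windows(
    clip: Sequence[Quaternion], window: int, step: int
) -> List[List[Quaternion]]:
    """Split one clip into overlapping windows of equal length."""
    return [
        list(clip[i:i + window])
        for i in range(0, len(clip) - window + 1, step)
    ]


# Synthetic stream: 100 samples at an assumed 50 Hz (2 seconds of data).
stream = [(1.0, 0.0, 0.0, float(i)) for i in range(100)]

# Hypothetical gesture boundaries, as they might come from the audio track.
segments = [(0.0, 1.0, "stroke_a"), (1.0, 2.0, "stroke_b")]

clips = slice_gestures(stream, 50.0, segments)
windows = fixed_windows(clips[0][1], window=20, step=10)
```

Each window has the same number of samples, so the windows can be stacked into a single array with one row per sample and one column per quaternion component before being passed to a classifier.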