
Intelligent auxiliary system for music performance under edge computing and long short-term recurrent neural networks.

Author information

KU School of Music, Lawrence, Kansas, United States of America.

Publication information

PLoS One. 2023 May 8;18(5):e0285496. doi: 10.1371/journal.pone.0285496. eCollection 2023.

Abstract

Music performance action generation, a research hotspot in computer vision and cross-sequence analysis, can be applied in many real-world scenarios. However, current generation methods consistently ignore the connection between music and performance actions, producing a strong sense of separation between the visual and the auditory content. This paper first analyzes the attention mechanism, the Recurrent Neural Network (RNN), and the Long Short-Term Memory (LSTM) RNN; the LSTM RNN is well suited to sequence data with strong temporal correlation. On this basis, the existing learning method is improved and a new model combining an attention mechanism with an LSTM RNN is proposed, which generates performance actions from music beat sequences. In addition, an image-description generative model with an attention mechanism is adopted and, combined with a non-recursive abstract RNN structure, the abstract RNN-LSTM network structure is optimized. Music beat recognition and dance movement extraction are used to allocate and adjust data resources in an edge server architecture. The model's loss function value serves as the evaluation metric. The proposed model's advantages lie mainly in the high accuracy and low consumption rate of dance movement recognition. The experimental results show that the model's loss reaches as low as 0.00026, and the generated video is best when the LSTM module has 3 layers, 256 nodes per layer, and a lookback value of 15. Compared with three other cross-domain sequence analysis models, the new model generates harmonious and rich performance action sequences while ensuring the stability of action generation, and excels at combining music with performance actions.
This work provides a practical reference for applying edge computing technology in intelligent auxiliary systems for music performance.
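The lookback hyperparameter reported above (a window of 15 past beats feeding each predicted action) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `make_windows` and the toy beat/action sequences are assumptions for illustration only.

```python
LOOKBACK = 15  # window length the experiments report as best


def make_windows(beats, actions, lookback=LOOKBACK):
    """Pair each length-`lookback` window of beat features with the
    action that immediately follows it, sliding one step at a time."""
    xs, ys = [], []
    for t in range(len(beats) - lookback):
        xs.append(beats[t:t + lookback])   # input: beats t .. t+lookback-1
        ys.append(actions[t + lookback])   # target: action at t+lookback
    return xs, ys


# Toy data: 20 scalar beat features and matching action labels.
beats = list(range(20))
acts = [b * 2 for b in range(20)]
X, y = make_windows(beats, acts)
print(len(X), len(X[0]), y[0])  # 5 windows of length 15; first target is acts[15] = 30
```

In a model like the one described, each such window would be fed through the stacked LSTM layers, with attention weighting the per-step hidden states before the action is predicted.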


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd21/10166492/7c36d0e7ff8b/pone.0285496.g001.jpg
