Research on Audio Recognition Based on the Deep Neural Network in Music Teaching.

Affiliations

School of Music and Performing Arts, Mianyang Teachers' College, Mianyang 621000, China.

College of Management Science, Chengdu University of Technology, Chengdu 610059, China.

Publication information

Comput Intell Neurosci. 2022 May 27;2022:7055624. doi: 10.1155/2022/7055624. eCollection 2022.

Abstract

Solfeggio is a core foundational course for music majors, and audio recognition training is one of its key components. As computer performance has improved, audio recognition has become widely used in smart wearable devices, and recent advances in deep learning have accelerated research in the field. However, music teaching environments contain substantial sound interference, so the performance of audio classifiers falls short of practical requirements. To address this problem, an improved audio recognition system based on YOLO-v4 is proposed, in which the main changes are to the network structure. First, Mel-frequency cepstral coefficients (MFCCs) are used to process the raw audio and extract the corresponding features. Second, the YOLO-v4 model, a deep learning model, is applied to audio recognition and combined with a spatial pyramid pooling module to strengthen generalization across data in different audio formats. Third, the stacking method from ensemble learning is used to fuse the independent submodels of two different channels. Experimental results show that, compared with other deep learning techniques, the improved YOLO-v4 model improves audio recognition performance and handles data in different audio formats better, indicating stronger generalization ability.
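To illustrate why a spatial pyramid pooling step can help with inputs of differing sizes, here is a minimal pure-Python sketch of the original fixed-length form of the technique: a 2-D feature map is max-pooled over a 1x1, 2x2, and 4x4 grid, and the results are concatenated. This is not the authors' implementation. The abstract does not say which pooling variant or pyramid levels are used, and the function names, the `levels` default, and the toy 13-coefficient feature maps are all assumptions made for illustration.

```python
def bin_edges(size, n_bins):
    """Split range(size) into n_bins non-empty (possibly overlapping) bins.

    Returns a list of (start, end) index pairs, end exclusive.
    """
    return [((i * size) // n_bins, ((i + 1) * size + n_bins - 1) // n_bins)
            for i in range(n_bins)]


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map over an n x n grid for each pyramid level.

    feature_map: list of rows, e.g. coefficients x time frames.
    Returns a flat list whose length depends only on `levels`,
    not on the size of the input.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for r0, r1 in bin_edges(height, n):
            for c0, c1 in bin_edges(width, n):
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two hypothetical clips of different lengths (toy values, not real MFCCs):
# 13 coefficients x 40 frames and 13 coefficients x 95 frames.
short_clip = [[(r * c) % 7 for c in range(40)] for r in range(13)]
long_clip = [[(r + c) % 11 for c in range(95)] for r in range(13)]

print(len(spatial_pyramid_pool(short_clip)))  # 21 = 1 + 4 + 16
print(len(spatial_pyramid_pool(long_clip)))   # 21 as well
```

Both clips map to a 21-element vector, so a downstream classifier sees the same input size regardless of clip duration. In a real network the pooling would run per channel on learned feature maps rather than on raw values.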

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/43e1/9166999/96f6090ba453/CIN2022-7055624.001.jpg
