Xinyang Normal University, Xinyang, Henan 464000, China.
Comput Intell Neurosci. 2022 Jun 13;2022:4626867. doi: 10.1155/2022/4626867. eCollection 2022.
In this paper, a residual convolutional neural network is used to extract note features from music score images, alleviating the problem of model degradation; multiscale feature fusion then combines feature information from different levels of the same feature map to strengthen the model's feature representation ability. A network composed of bidirectional simple recurrent units (SRU) and a connectionist temporal classification (CTC) function is used to recognize the notes: the SRU parallelizes a large amount of the computation, which speeds up training convergence, while CTC removes the need for strict label alignment in the dataset and thus lowers the requirements on the data. To address the insufficiency of existing common-subspace cross-modal retrieval methods in mining local consistency within modalities, a cross-modal retrieval method incorporating graph convolution is proposed. The K-nearest-neighbor algorithm is used to construct a modal graph for the samples of each modality; the original features of samples from different modalities are encoded by a symmetric graph convolutional encoding network and a symmetric multilayer fully connected encoding network, and the encoded features are fused before being projected into the common subspace. Intramodal semantic constraints and intermodal modality-invariant constraints are jointly optimized in the common subspace to learn common representations with high local consistency and semantic consistency for samples from different modalities. The error values of the experimental results illustrate the effect of parameters such as the number of iterations and the number of neurons on network performance. To show more precisely that the generated music sequences closely resemble the original sequences, the generated sequences are also divided into frames, from which frequency spectra and spectrograms are produced; comparing these plots demonstrates the accuracy of the experiment, and genre classification is performed on the generated music to show that the network can generate music of different genres.
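A minimal sketch of the note-recognition pipeline described above, written in PyTorch under stated assumptions: a bidirectional GRU stands in for the paper's simple recurrent unit (which has no built-in PyTorch module), and all layer sizes, the vocabulary size, and the fusion scheme are illustrative choices, not the paper's values.

```python
# Hedged sketch: residual-CNN feature extractor with multiscale fusion,
# feeding a bidirectional recurrent layer trained with CTC loss.
# A bidirectional GRU stands in for the paper's SRU; sizes are assumptions.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Identity shortcut counters model degradation in deeper stacks.
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class NoteRecognizer(nn.Module):
    def __init__(self, num_classes, channels=64, hidden=128):
        super().__init__()
        self.stem = nn.Conv2d(1, channels, 3, padding=1)
        self.block1 = ResidualBlock(channels)
        self.block2 = ResidualBlock(channels)
        # Multiscale fusion: concatenate features from two depths of the
        # same map so shallow detail and deep semantics are both kept.
        self.pool = nn.AdaptiveAvgPool2d((1, None))  # collapse height only
        self.rnn = nn.GRU(2 * channels, hidden, bidirectional=True,
                          batch_first=True)
        self.head = nn.Linear(2 * hidden, num_classes)  # classes incl. CTC blank

    def forward(self, x):                      # x: (B, 1, H, W)
        f1 = self.block1(self.stem(x))
        f2 = self.block2(f1)
        fused = torch.cat([f1, f2], dim=1)     # fuse two feature levels
        seq = self.pool(fused).squeeze(2)      # (B, 2C, W)
        seq = seq.transpose(1, 2)              # (B, W, 2C): width as time axis
        out, _ = self.rnn(seq)
        return self.head(out).log_softmax(-1)  # CTC expects log-probs

# CTC training step: no frame-level alignment between image columns and
# note labels is required, which is the dataset relaxation the abstract cites.
model = NoteRecognizer(num_classes=81)
ctc = nn.CTCLoss(blank=0)
images = torch.randn(4, 1, 64, 256)
logits = model(images).transpose(0, 1)          # (T, B, C) for CTCLoss
targets = torch.randint(1, 81, (4, 10))         # label IDs; 0 reserved for blank
in_lens = torch.full((4,), logits.size(0), dtype=torch.long)
tgt_lens = torch.full((4,), 10, dtype=torch.long)
loss = ctc(logits, targets, in_lens, tgt_lens)
loss.backward()
```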
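A similarly hedged sketch of the cross-modal branch: a K-nearest-neighbor modal graph, a symmetric graph convolutional encoder, and a fully connected encoder whose outputs are fused into the common subspace, trained with an intramodal semantic term plus an intermodal invariance term. The value of K, the layer sizes, the addition-based fusion, and the MSE invariance loss are illustrative assumptions; the paper's exact formulation may differ.

```python
# Hedged sketch: KNN modal graph + symmetric GCN/MLP encoders per modality,
# jointly optimized with semantic and modality-invariance constraints.
import torch
import torch.nn as nn
import torch.nn.functional as F

def knn_graph(feats, k=5):
    # Build a symmetrically normalized KNN adjacency within one modality.
    dist = torch.cdist(feats, feats)
    idx = dist.topk(k + 1, largest=False).indices[:, 1:]  # drop self-match
    n = feats.size(0)
    adj = torch.zeros(n, n)
    adj.scatter_(1, idx, 1.0)
    adj = ((adj + adj.t()) > 0).float() + torch.eye(n)    # symmetrize + self-loop
    d_inv = adj.sum(1).pow(-0.5).diag()
    return d_inv @ adj @ d_inv                            # D^-1/2 A D^-1/2

class GCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, x, adj):
        return F.relu(adj @ self.lin(x))   # propagate features over the modal graph

class ModalityEncoder(nn.Module):
    """One per modality; both branches share this symmetric design."""
    def __init__(self, d_in, d_common=256, num_classes=10):
        super().__init__()
        self.gcn1 = GCNLayer(d_in, 512)
        self.gcn2 = GCNLayer(512, d_common)
        self.mlp = nn.Sequential(nn.Linear(d_in, 512), nn.ReLU(),
                                 nn.Linear(512, d_common))
        self.cls = nn.Linear(d_common, num_classes)  # intramodal semantic head

    def forward(self, x, adj):
        g = self.gcn2(self.gcn1(x, adj), adj)  # local-consistency branch
        z = self.mlp(x) + g                    # fusion by addition (assumption)
        return z, self.cls(z)

# Joint objective: a semantic constraint per modality plus an invariance
# term pulling paired image/text codes together in the common subspace.
img, txt = torch.randn(32, 4096), torch.randn(32, 300)
labels = torch.randint(0, 10, (32,))
enc_i, enc_t = ModalityEncoder(4096), ModalityEncoder(300)
zi, pi = enc_i(img, knn_graph(img))
zt, pt = enc_t(txt, knn_graph(txt))
loss = (F.cross_entropy(pi, labels) + F.cross_entropy(pt, labels)
        + F.mse_loss(zi, zt))                  # modality-invariance term
loss.backward()
```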