
EEG-driven automatic generation of emotive music based on transformer.

Authors

Jiang Hui, Chen Yu, Wu Di, Yan Jinlin

Affiliations

School of Computer Science and Artificial Intelligence, Hefei Normal University, Hefei, China.

School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China.

Publication

Front Neurorobot. 2024 Aug 19;18:1437737. doi: 10.3389/fnbot.2024.1437737. eCollection 2024.

Abstract

Utilizing deep features from electroencephalography (EEG) data for emotional music composition offers a novel approach to creating personalized and emotionally rich music. Compared with textual data, converting continuous EEG and music data into discrete units presents significant challenges, particularly the absence of a clear, fixed vocabulary for standardizing EEG and audio data. Without such a standard, the mapping between EEG signals and musical elements (such as rhythm, melody, and emotion) remains blurry and complex. We therefore propose a method that uses clustering to create discrete representations and a Transformer model to learn the reverse mapping between them. Specifically, the model uses clustering labels to segment the signals and encodes EEG and emotional music data independently to construct a vocabulary, thereby achieving a discrete representation. A time-series dictionary built with clustering algorithms captures and exploits the temporal and structural relationships between EEG and audio data more effectively. To address the insensitivity to temporal information in heterogeneous data, we adopt a multi-head attention mechanism and positional encoding, enabling the model to attend to information in different subspaces and better understand the complex internal structure of EEG and audio data. In addition, to address the mismatch between local and global information in emotion-driven music generation, we introduce an audio masking prediction loss. The music generated by our method achieves a score of 68.19% on the evaluation metric, an improvement of 4.9% over other methods, demonstrating the effectiveness of the approach.
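The abstract outlines three technical ingredients: clustering continuous EEG/audio frames into a discrete vocabulary (the time-series dictionary), a Transformer with multi-head attention and positional encoding, and a masked audio-token prediction loss. The sketch below is a minimal illustration of that pipeline under assumed settings, not the authors' implementation: the feature dimension, vocabulary size, sequence length, and model hyperparameters are placeholders, and k-means stands in for whichever clustering algorithm the paper actually uses.

# Minimal sketch (not the paper's code): discretize continuous frame features
# with k-means so that cluster indices act as tokens, then train a small
# Transformer with positional encoding and multi-head attention using a
# masked-token prediction loss. All sizes below are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def tokenize(frames: np.ndarray, n_tokens: int = 256) -> np.ndarray:
    """Cluster per-frame feature vectors; the cluster index of each frame is its discrete token."""
    km = KMeans(n_clusters=n_tokens, n_init=10, random_state=0).fit(frames)
    return km.labels_  # shape: (num_frames,)

class MaskedTokenTransformer(nn.Module):
    """Transformer encoder that predicts the identity of masked tokens."""
    def __init__(self, vocab_size: int = 256, d_model: int = 128,
                 n_heads: int = 4, n_layers: int = 2, max_len: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size + 1, d_model)          # +1 slot for the [MASK] token
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))   # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens) + self.pos[:, : tokens.size(1)]
        return self.head(self.encoder(x))                           # (B, T, vocab_size) logits

def masked_prediction_loss(model, tokens, mask_id, mask_prob=0.15):
    """Randomly mask tokens and compute cross-entropy only at the masked positions."""
    mask = torch.rand_like(tokens, dtype=torch.float) < mask_prob
    corrupted = tokens.masked_fill(mask, mask_id)
    logits = model(corrupted)
    return nn.functional.cross_entropy(logits[mask], tokens[mask])

# Usage with random stand-in data (real inputs would be EEG/audio frame features).
frames = np.random.randn(4096, 40).astype(np.float32)
tokens = torch.from_numpy(tokenize(frames)).long().view(8, 512)
model = MaskedTokenTransformer()
loss = masked_prediction_loss(model, tokens, mask_id=256)
loss.backward()

Here the cluster indices play the role of the vocabulary entries described in the abstract, and the loss is evaluated only at masked positions, mirroring the masked-prediction objective; a full system would pair an EEG encoder with the audio token decoder rather than modeling a single token stream.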

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2d88/11366740/3e5fbd98b48a/fnbot-18-1437737-g0001.jpg
