Multi-modal emotion recognition in conversation based on prompt learning with text-audio fusion features.

Authors

Wu Yuezhou, Zhang Siling, Li Pengfei

Affiliation

School of Computer Science, Civil Aviation Flight University of China, Guanghan, 618307, China.

Publication

Sci Rep. 2025 Mar 14;15(1):8855. doi: 10.1038/s41598-025-89758-8.

DOI: 10.1038/s41598-025-89758-8
PMID: 40087340
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11909257/
Abstract

With the widespread adoption of interactive machine applications, Emotion Recognition in Conversations (ERC) technology has garnered increasing attention. Although existing methods have improved recognition accuracy by integrating structured data, language barriers and the scarcity of non-English resources limit their cross-lingual applications. In light of this, the MERC-PLTAF method proposed in this paper innovatively focuses on multimodal emotion recognition in conversations, aiming to overcome the limitations of single modality and language barriers through refined feature extraction and a sophisticated cross-fusion strategy. We conducted extensive validation on multiple English and Chinese datasets, and the experimental results demonstrate that this method not only significantly improves emotion recognition accuracy but also exhibits exceptional performance on the Chinese M3ED dataset, paving a new path for cross-lingual emotion recognition. This research not only advances the boundaries of emotion recognition technology but also lays a solid theoretical foundation and practical framework for creating more intelligent and human-centric interactive experiences.
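The abstract names prompt learning and a text-audio cross-fusion strategy but does not spell out the architecture. Purely as an illustrative sketch, not the paper's published method, the PyTorch snippet below shows one common way to cross-fuse text and audio utterance features with bidirectional cross-attention; every name, dimension, and layer choice here (TextAudioCrossFusion, hidden_dim=256, mean pooling, a 7-class head) is an assumption, and the prompt-learning component is omitted.

# A minimal, hypothetical sketch of text-audio cross-fusion for ERC.
# Nothing below is taken from the paper: the class name, dimensions,
# the nn.MultiheadAttention layers, mean pooling, and the 7-class head
# are all illustrative assumptions.
import torch
import torch.nn as nn

class TextAudioCrossFusion(nn.Module):
    """Fuses text and audio utterance features with bidirectional cross-attention."""

    def __init__(self, text_dim=768, audio_dim=512, hidden_dim=256,
                 num_heads=4, num_classes=7):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        # Each modality attends over the other (cross-attention in both directions).
        self.text_to_audio = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.audio_to_text = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, text_feats, audio_feats):
        # text_feats: (batch, text_len, text_dim); audio_feats: (batch, audio_len, audio_dim)
        t = self.text_proj(text_feats)
        a = self.audio_proj(audio_feats)
        # Text queries attend to audio keys/values, and vice versa.
        t_fused, _ = self.text_to_audio(query=t, key=a, value=a)
        a_fused, _ = self.audio_to_text(query=a, key=t, value=t)
        # Pool each fused sequence and classify the utterance's emotion.
        fused = torch.cat([t_fused.mean(dim=1), a_fused.mean(dim=1)], dim=-1)
        return self.classifier(fused)

# Example with random tensors standing in for encoder outputs.
model = TextAudioCrossFusion()
text = torch.randn(2, 20, 768)   # e.g., token embeddings from a text encoder
audio = torch.randn(2, 50, 512)  # e.g., frame embeddings from an audio encoder
logits = model(text, audio)      # shape (2, 7): emotion logits per utterance

In a prompt-learning variant of this idea, the fused features would typically condition a template such as "The speaker feels [MASK]", with a verbalizer mapping the masked-token distribution to emotion labels; the abstract does not specify which design MERC-PLTAF uses.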

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/5c40179ee9e3/41598_2025_89758_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/9c1043dc1ee2/41598_2025_89758_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/582c2f14965c/41598_2025_89758_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/e4e9e38ad24c/41598_2025_89758_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/f930ffd9d2be/41598_2025_89758_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/5721760533f6/41598_2025_89758_Fig6_HTML.jpg
Fig. 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/a52f1e354af3/41598_2025_89758_Fig7_HTML.jpg
Fig. 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/84ed72061c23/41598_2025_89758_Fig8_HTML.jpg
Fig. 9: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c95a/11909257/0861f69de101/41598_2025_89758_Fig9_HTML.jpg

Similar Articles

1. Multi-modal emotion recognition in conversation based on prompt learning with text-audio fusion features. Sci Rep. 2025 Mar 14;15(1):8855. doi: 10.1038/s41598-025-89758-8.
2. AVaTER: Fusing Audio, Visual, and Textual Modalities Using Cross-Modal Attention for Emotion Recognition. Sensors (Basel). 2024 Sep 10;24(18):5862. doi: 10.3390/s24185862.
3. Cross-modal credibility modelling for EEG-based multimodal emotion recognition. J Neural Eng. 2024 Apr 11;21(2). doi: 10.1088/1741-2552/ad3987.
4. Robust Multimodal Emotion Recognition from Conversation with Transformer-Based Crossmodality Fusion. Sensors (Basel). 2021 Jul 19;21(14):4913. doi: 10.3390/s21144913.
5. Emotion Recognition Using EEG Signals and Audiovisual Features with Contrastive Learning. Bioengineering (Basel). 2024 Oct 3;11(10):997. doi: 10.3390/bioengineering11100997.
6. Research on cross-modal emotion recognition based on multi-layer semantic fusion. Math Biosci Eng. 2024 Jan 17;21(2):2488-2514. doi: 10.3934/mbe.2024110.
7. A fine-grained human facial key feature extraction and fusion method for emotion recognition. Sci Rep. 2025 Feb 20;15(1):6153. doi: 10.1038/s41598-025-90440-2.
8. MMAgentRec, a personalized multi-modal recommendation agent with large language model. Sci Rep. 2025 Apr 8;15(1):12062. doi: 10.1038/s41598-025-96458-w.
9. Exploring emotional climate recognition in peer conversations through bispectral features and affect dynamics. Comput Methods Programs Biomed. 2025 Jun;265:108695. doi: 10.1016/j.cmpb.2025.108695. Epub 2025 Mar 18.
10. Multimodal Emotion Recognition Based on Cascaded Multichannel and Hierarchical Fusion. Comput Intell Neurosci. 2023 Jan 5;2023:9645611. doi: 10.1155/2023/9645611. eCollection 2023.

Cited By

1. Cross-modal gated feature enhancement for multimodal emotion recognition in conversations. Sci Rep. 2025 Aug 16;15(1):30004. doi: 10.1038/s41598-025-11989-6.
2. A Comprehensive Review of Multimodal Emotion Recognition: Techniques, Challenges, and Future Directions. Biomimetics (Basel). 2025 Jun 27;10(7):418. doi: 10.3390/biomimetics10070418.
