
End-to-End Multimodal Emotion Recognition Based on Facial Expressions and Remote Photoplethysmography Signals

Authors

Li Jixiang, Peng Jianxin

Publication

IEEE J Biomed Health Inform. 2024 Oct;28(10):6054-6063. doi: 10.1109/JBHI.2024.3430310. Epub 2024 Oct 3.

DOI: 10.1109/JBHI.2024.3430310
PMID: 39024092
Abstract

Emotion is a complex physiological phenomenon, and a single modality may be insufficient for accurately determining human emotional states. This paper proposes an end-to-end multimodal emotion recognition method based on facial expressions and non-contact physiological signals. Facial expression features and remote photoplethysmography (rPPG) signals are extracted from facial video data, and a transformer-based cross-modal attention mechanism (TCMA) is used to learn the correlation between the two modalities. The results show that the accuracy of emotion recognition can be slightly improved by combining facial expressions with accurate rPPG signals. The performance is further improved with the use of TCMA, for which the binary classification accuracy of valence and arousal is 91.11% and 90.00%, respectively. Additionally, when experiments are conducted using the whole dataset, an increased accuracy of 7.31% and 4.23% for the binary classification of valence and arousal, and an improved accuracy of 5.36% for the four classifications of valence-arousal are achieved when TCMA is used in modal fusion, compared to using only facial expression modality, which fully demonstrates the effectiveness and robustness of TCMA. This method makes it possible to realize multimodal emotion recognition of facial expressions and contactless physiological signals in reality.
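The fusion step described in the abstract — queries from one modality attending over keys and values from the other — can be sketched as follows. This is a minimal single-head illustration of cross-modal attention in general; the feature dimensions, sequence lengths, and random projections are assumptions for demonstration, not the paper's actual TCMA architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(face_feats, rppg_feats, d_k=32, seed=0):
    """One cross-modal attention head: queries come from the facial-expression
    stream, keys/values from the rPPG stream, so each facial time step is
    re-represented as a weighted mix of physiological-signal features."""
    rng = np.random.default_rng(seed)
    d_face, d_rppg = face_feats.shape[-1], rppg_feats.shape[-1]
    # Random projection matrices stand in for learned weights.
    W_q = rng.standard_normal((d_face, d_k)) / np.sqrt(d_face)
    W_k = rng.standard_normal((d_rppg, d_k)) / np.sqrt(d_rppg)
    W_v = rng.standard_normal((d_rppg, d_k)) / np.sqrt(d_rppg)
    Q = face_feats @ W_q                      # (T_face, d_k)
    K = rppg_feats @ W_k                      # (T_rppg, d_k)
    V = rppg_feats @ W_v                      # (T_rppg, d_k)
    attn = softmax(Q @ K.T / np.sqrt(d_k))    # (T_face, T_rppg)
    return attn @ V                           # rPPG-informed facial features

face = np.random.randn(16, 64)   # e.g. 16 video frames, 64-d expression features
rppg = np.random.randn(100, 8)   # e.g. 100 rPPG samples, 8-d signal features
fused = cross_modal_attention(face, rppg)
print(fused.shape)  # (16, 32)
```

Note that the two streams may have different lengths and dimensionalities; attention aligns them without resampling, which is one reason transformer-based fusion suits video-plus-physiology data.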


Similar Articles

1. End-to-End Multimodal Emotion Recognition Based on Facial Expressions and Remote Photoplethysmography Signals.
IEEE J Biomed Health Inform. 2024 Oct;28(10):6054-6063. doi: 10.1109/JBHI.2024.3430310. Epub 2024 Oct 3.
2. A fine-grained human facial key feature extraction and fusion method for emotion recognition.
Sci Rep. 2025 Feb 20;15(1):6153. doi: 10.1038/s41598-025-90440-2.
3. Using Facial Micro-Expressions in Combination With EEG and Physiological Signals for Emotion Recognition.
Front Psychol. 2022 Jun 28;13:864047. doi: 10.3389/fpsyg.2022.864047. eCollection 2022.
4. Multimodal emotion recognition by combining physiological signals and facial expressions: a preliminary study.
Annu Int Conf IEEE Eng Med Biol Soc. 2012;2012:5238-41. doi: 10.1109/EMBC.2012.6347175.
5. Emotion Classification Based on Pulsatile Images Extracted from Short Facial Videos via Deep Learning.
Sensors (Basel). 2024 Apr 19;24(8):2620. doi: 10.3390/s24082620.
6. Feature selection for multimodal emotion recognition in the arousal-valence space.
Annu Int Conf IEEE Eng Med Biol Soc. 2013;2013:4330-3. doi: 10.1109/EMBC.2013.6610504.
7. Fusion of Facial Expressions and EEG for Multimodal Emotion Recognition.
Comput Intell Neurosci. 2017;2017:2107451. doi: 10.1155/2017/2107451. Epub 2017 Sep 19.
8. Joint low-rank tensor fusion and cross-modal attention for multimodal physiological signals based emotion recognition.
Physiol Meas. 2024 Jul 11;45(7). doi: 10.1088/1361-6579/ad5bbc.
9. [Emotion Recognition Based on Multiple Physiological Signals].
Zhongguo Yi Liao Qi Xie Za Zhi. 2020 Apr 8;44(4):283-287. doi: 10.3969/j.issn.1671-7104.2020.04.001.
10. ConDiff-rPPG: Robust Remote Physiological Measurement to Heterogeneous Occlusions.
IEEE J Biomed Health Inform. 2024 Dec;28(12):7090-7102. doi: 10.1109/JBHI.2024.3433461. Epub 2024 Dec 5.