
End-to-End Multimodal Emotion Recognition Based on Facial Expressions and Remote Photoplethysmography Signals.

Author Information

Li Jixiang, Peng Jianxin

Publication Information

IEEE J Biomed Health Inform. 2024 Oct;28(10):6054-6063. doi: 10.1109/JBHI.2024.3430310. Epub 2024 Oct 3.

Abstract

Emotion is a complex physiological phenomenon, and a single modality may be insufficient for accurately determining human emotional states. This paper proposes an end-to-end multimodal emotion recognition method based on facial expressions and non-contact physiological signals. Facial expression features and remote photoplethysmography (rPPG) signals are extracted from facial video data, and a transformer-based cross-modal attention mechanism (TCMA) is used to learn the correlation between the two modalities. The results show that combining facial expressions with accurate rPPG signals slightly improves emotion recognition accuracy. Performance is further improved with TCMA, which achieves binary classification accuracies of 91.11% for valence and 90.00% for arousal. Additionally, in experiments on the whole dataset, using TCMA for modal fusion yields accuracy gains of 7.31% and 4.23% for the binary classification of valence and arousal, respectively, and 5.36% for the four-class classification of valence-arousal, compared to using the facial expression modality alone, which fully demonstrates the effectiveness and robustness of TCMA. This method makes multimodal emotion recognition from facial expressions and contactless physiological signals feasible in real-world settings.
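The abstract does not specify TCMA's internal architecture. As an illustration only, the core idea of cross-modal attention (one modality supplies the queries, the other supplies the keys and values) can be sketched in plain NumPy; all function names, dimensions, and parameters below are hypothetical, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values, wq, wk, wv):
    """Single-head cross-modal attention: tokens of one modality attend
    to tokens of the other (a simplified sketch of the TCMA idea)."""
    q = queries @ wq            # (Tq, d) query projections
    k = keys_values @ wk        # (Tk, d) key projections
    v = keys_values @ wv        # (Tk, d) value projections
    scores = q @ k.T / np.sqrt(q.shape[-1])      # scaled dot-product
    return softmax(scores, axis=-1) @ v          # (Tq, d) fused output

# Hypothetical toy inputs: 8 facial-expression tokens, 32 rPPG tokens.
rng = np.random.default_rng(0)
d = 16
face_feats = rng.standard_normal((8, d))
rppg_feats = rng.standard_normal((32, d))
wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

# Facial tokens query the rPPG sequence; output keeps the query length.
fused = cross_modal_attention(face_feats, rppg_feats, wq, wk, wv)
print(fused.shape)  # (8, 16)
```

In a full model this would typically run in both directions (face→rPPG and rPPG→face) with multiple heads and residual connections before classification, but those details are assumptions here.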

