

Emotion Recognition from rPPG via Physiologically Inspired Temporal Encoding and Attention-Based Curriculum Learning.

Author Information

Lee Changmin, Lee Hyunwoo, Whang Mincheol

Affiliations

Department of Human-Centered Artificial Intelligence, Sangmyung University, Seoul 03016, Republic of Korea.

Department of Emotion Engineering, Sangmyung University, Seoul 03016, Republic of Korea.

Publication Information

Sensors (Basel). 2025 Jun 26;25(13):3995. doi: 10.3390/s25133995.

Abstract

Remote photoplethysmography (rPPG) enables non-contact physiological measurement for emotion recognition, yet the temporally sparse nature of emotional cardiovascular responses, intrinsic measurement noise, weak session-level labels, and subtle correlates of valence pose critical challenges. To address these issues, we propose a physiologically inspired deep learning framework comprising a Multi-scale Temporal Dynamics Encoder (MTDE) to capture autonomic nervous system dynamics across multiple timescales, an adaptive sparse α-Entmax attention mechanism to identify salient emotional segments amidst noisy signals, Gated Temporal Pooling for the robust aggregation of emotional features, and a structured three-phase curriculum learning strategy to systematically handle temporal sparsity, weak labels, and noise. Evaluated on the MAHNOB-HCI dataset (27 subjects and 527 sessions with a subject-mixed split), our temporal-only model achieved competitive performance in arousal recognition (66.04% accuracy; 61.97% weighted F1-score), surpassing prior CNN-LSTM baselines. However, the lower performance on valence (62.26% accuracy) revealed inherent physiological limitations of unimodal temporal cardiovascular analysis. These findings establish clear benchmarks for temporal-only rPPG emotion recognition and underscore the necessity of incorporating spatial or multimodal information to effectively capture nuanced emotional dimensions such as valence, guiding future research directions in affective computing.
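Below is a minimal PyTorch sketch of how the pipeline described in the abstract (multi-scale temporal convolutions over the rPPG signal, sparse α-Entmax attention across time steps, and gated temporal pooling into a session-level vector) might be composed. All layer widths, kernel scales, the gating form, and the names MultiScaleTemporalEncoder, SparseAttentionGatedPooling, and RPPGEmotionNet are illustrative assumptions, not the authors' implementation; the α-Entmax normalization is drawn from the third-party entmax package where available, with a softmax fallback otherwise, and the curriculum learning schedule is omitted.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    try:
        # alpha-Entmax as in the paper; provided by the third-party `entmax` package.
        from entmax import entmax_bisect
    except ImportError:
        # Fallback so the sketch still runs without the package (loses sparsity).
        def entmax_bisect(x, alpha=1.5, dim=-1):
            return F.softmax(x, dim=dim)

    class MultiScaleTemporalEncoder(nn.Module):
        """Parallel 1D convolutions with different kernel sizes, concatenated along
        channels, to capture short- and long-range pulse dynamics (stand-in for the MTDE)."""
        def __init__(self, in_ch=1, ch_per_scale=32, kernel_sizes=(3, 9, 27)):
            super().__init__()
            self.branches = nn.ModuleList(
                [nn.Conv1d(in_ch, ch_per_scale, k, padding=k // 2) for k in kernel_sizes]
            )

        def forward(self, x):                      # x: (B, 1, T) rPPG signal
            feats = [F.relu(b(x)) for b in self.branches]
            return torch.cat(feats, dim=1)         # (B, C, T), C = len(scales) * ch_per_scale

    class SparseAttentionGatedPooling(nn.Module):
        """Scores each time step, normalizes scores with alpha-entmax so that noisy
        segments can receive exactly zero weight, then aggregates gated features."""
        def __init__(self, channels, alpha=1.5):
            super().__init__()
            self.score = nn.Conv1d(channels, 1, kernel_size=1)
            self.gate = nn.Conv1d(channels, channels, kernel_size=1)
            self.alpha = alpha

        def forward(self, h):                      # h: (B, C, T)
            w = entmax_bisect(self.score(h).squeeze(1), alpha=self.alpha, dim=-1)  # (B, T)
            gated = torch.sigmoid(self.gate(h)) * h                                # (B, C, T)
            return torch.einsum("bct,bt->bc", gated, w)                            # (B, C)

    class RPPGEmotionNet(nn.Module):
        """End-to-end sketch: encoder -> sparse attention + gated pooling -> classifier."""
        def __init__(self, n_classes=2):
            super().__init__()
            self.encoder = MultiScaleTemporalEncoder()
            self.pool = SparseAttentionGatedPooling(channels=96)
            self.head = nn.Linear(96, n_classes)

        def forward(self, x):                      # x: (B, 1, T)
            return self.head(self.pool(self.encoder(x)))

    if __name__ == "__main__":
        model = RPPGEmotionNet()
        dummy = torch.randn(4, 1, 30 * 60)         # 4 sessions, 60 s of rPPG at 30 Hz (assumed rate)
        print(model(dummy).shape)                  # torch.Size([4, 2]) -> binary arousal/valence logits

The key design point mirrored from the abstract is that α-Entmax, unlike softmax, can assign exactly zero attention to time steps, which matches the claim that emotional cardiovascular responses are temporally sparse within a noisy session.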


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2493/12251639/1db5fbdd7172/sensors-25-03995-g005.jpg
