Russo Samuele, Tibermacine Imad Eddine, Randieri Cristian, Rabehi Abdelaziz, Alharbi Amal H, El-Kenawy El-Sayed M, Napoli Christian
Department of Psychology, Sapienza University of Rome, Rome, Italy.
Department of Computer, Automation and Management Engineering, Sapienza University of Rome, Rome, Italy.
Front Neurosci. 2025 Aug 28;19:1622194. doi: 10.3389/fnins.2025.1622194. eCollection 2025.
Facial Emotion Recognition (FER) enables smart environments and robots to adapt their behavior to a user's affective state. Translating those recognized emotions into ambient cues, such as colored lighting, can improve comfort and engagement in Ambient Assisted Living (AAL) settings.
We design a FER pipeline that combines a Spatial Transformer Network for pose-invariant region focusing with a novel Multiple Self-Attention (MSA) block comprising parallel attention heads and learned fusion weights. The MSA block is inserted into a compact VGG-style backbone trained on the FER+ dataset using weighted sampling to counteract class imbalance. The resulting softmax probabilities are linearly blended with prototype hues derived from a simplified Plutchik wheel to drive RGB lighting in real time.
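A minimal NumPy sketch of the MSA idea described above: several parallel self-attention heads over flattened spatial features, fused with learned (softmax-normalized) weights and a residual connection. The head projections, head count, and fusion parameterization here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """One self-attention head.
    x: (N, C) flattened feature map (N spatial positions, C channels);
    wq, wk, wv: (C, C) projection matrices (assumed square for simplicity)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v

def msa_block(x, heads, fusion_logits):
    """Multiple Self-Attention: run parallel heads, blend their outputs
    with learned fusion weights, and add a residual connection."""
    w = softmax(fusion_logits)  # learned fusion weights, one per head
    fused = sum(wi * self_attention(x, *h) for wi, h in zip(w, heads))
    return fused + x  # residual
```

In training, `fusion_logits` would be a learnable parameter vector so the network can weight heads by usefulness; here it is just an input.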
The proposed VGGFac-STN-MSA model achieves 82.54% test accuracy on FER+, outperforming a CNN baseline and the reproduced Deep-Emotion architecture. Ablation shows that MSA contributes +1% accuracy. Continuous color blending yields smooth, intensity-aware lighting transitions in a proof-of-concept demo.
Our attention scheme is architecture-agnostic, adds minimal computational overhead, and markedly boosts FER accuracy on low-resolution faces. Coupling the probability distribution directly to the RGB gamut provides a fine-grained, perceptually meaningful channel for affect-adaptive AAL systems.
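The probability-to-lighting coupling can be sketched as a convex blend of prototype colors weighted by the classifier's softmax output. The prototype hues below are hypothetical placeholders loosely inspired by a simplified Plutchik wheel; the paper's exact hue assignments are not reproduced here.

```python
import colorsys
import numpy as np

# Hypothetical prototype hues in degrees (illustrative, not the paper's mapping)
PROTOTYPE_HUES = {
    "happiness": 60.0,   # yellow
    "anger": 0.0,        # red
    "sadness": 240.0,    # blue
    "surprise": 180.0,   # cyan
}

def probs_to_rgb(probs):
    """Blend softmax class probabilities with prototype hues:
    each emotion's hue is converted to RGB at full saturation/value,
    then averaged with the probabilities as weights."""
    rgb = np.zeros(3)
    for emotion, p in probs.items():
        h = PROTOTYPE_HUES[emotion] / 360.0
        rgb += p * np.array(colorsys.hsv_to_rgb(h, 1.0, 1.0))
    return np.clip(rgb, 0.0, 1.0)
```

Because the blend is linear in the probabilities, small shifts in the predicted distribution produce smooth, intensity-aware color transitions rather than abrupt jumps between discrete emotion colors.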