Emotion Recognition from Large-Scale Video Clips with Cross-Attention and Hybrid Feature Weighting Neural Networks.

Affiliation

Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province, Zhejiang Normal University, Jinhua 321004, China.

Publication information

Int J Environ Res Public Health. 2023 Jan 12;20(2):1400. doi: 10.3390/ijerph20021400.

DOI:10.3390/ijerph20021400
PMID:36674161
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9859118/
Abstract

Human emotion is an important indicator of mental state (e.g., satisfaction or stress), and recognizing emotion from different media is essential for sequence analysis and for applications such as mental health assessment, job stress level estimation, and tourist satisfaction assessment. Emotion recognition based on computer vision techniques, which detects emotion from visual media (e.g., images or videos) of human behavior using plentiful emotional cues, has been extensively investigated because of its significant applications. However, most existing models neglect inter-feature interaction and use simple concatenation for feature fusion, failing to capture the crucial complementary gains between face and context information in video clips, which are significant in addressing the problems of emotion confusion and emotion misunderstanding. Accordingly, to fully exploit the complementary information between face and context features, we present a novel cross-attention and hybrid feature weighting network for accurate emotion recognition from large-scale video clips. The proposed model consists of a dual-branch encoding (DBE) network, a hierarchical-attention encoding (HAE) network, and a deep fusion (DF) block. Specifically, the face and context encoding blocks in the DBE network generate the respective shallow features. The HAE network then uses the cross-attention (CA) block to capture the complementarity between facial expression features and their contexts via a cross-channel attention operation. The element recalibration (ER) block revises the feature map of each channel by embedding global information. Moreover, the adaptive-attention (AA) block in the HAE network infers the optimal feature fusion weights and obtains the adaptive emotion features via a hybrid feature weighting operation. Finally, the DF block integrates these adaptive emotion features to predict an individual's emotional state. Extensive experimental results on the CAER-S dataset demonstrate the effectiveness of our method, indicating its potential in the analysis of tourist reviews with video clips, the estimation of job stress levels from visual emotional evidence, and the assessment of mental health from visual media.
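
The abstract contrasts simple concatenation with cross-channel attention followed by adaptive fusion weights. The toy sketch below illustrates that general idea only; it is not the authors' implementation. All function names are hypothetical, the gate and the branch scores are fixed formulas (sigmoid of a channel mean, mean activation) standing in for parameters a real model would learn, and both branches are assumed to have the same number of channels.

```python
import math


def global_average(feature_map):
    """Squeeze each channel (a list of activations) to a single scalar."""
    return [sum(channel) / len(channel) for channel in feature_map]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_channel_gate(target, source):
    """Rescale each channel of `target` by a gate computed from `source`.

    Assumption: gate = sigmoid(channel mean of source), with no learned weights.
    """
    gates = [sigmoid(m) for m in global_average(source)]
    return [[g * v for v in channel] for g, channel in zip(gates, target)]


def adaptive_fusion(face, context):
    """Fuse two branches with softmax weights instead of concatenation.

    Assumption: each branch score is its mean activation; a trained model
    would infer these scores with a small learned network.
    """
    face_vec = global_average(face)
    ctx_vec = global_average(context)
    scores = [sum(face_vec) / len(face_vec), sum(ctx_vec) / len(ctx_vec)]
    w_face, w_ctx = softmax(scores)
    fused = [w_face * f + w_ctx * c for f, c in zip(face_vec, ctx_vec)]
    return fused, (w_face, w_ctx)


# Toy feature maps: 2 channels x 2 spatial positions per branch.
face = [[1.0, 2.0], [0.0, 0.0]]
context = [[0.5, 0.5], [3.0, 1.0]]

face_att = cross_channel_gate(face, context)   # face gated by context
ctx_att = cross_channel_gate(context, face)    # context gated by face
fused, weights = adaptive_fusion(face_att, ctx_att)
print(fused, weights)
```

The two weights always sum to one, so the fused vector is a convex combination of the two branch descriptors; this is what separates weighted fusion from concatenation, where both branches contribute with fixed, equal standing.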

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/44bd4c68d17f/ijerph-20-01400-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/adc6d58c9ca6/ijerph-20-01400-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/92e66d9a43d5/ijerph-20-01400-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/102a93edf851/ijerph-20-01400-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/c7e34011ebcb/ijerph-20-01400-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/a993e400a59b/ijerph-20-01400-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/fef148904a36/ijerph-20-01400-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/4e13f175ca00/ijerph-20-01400-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/9cec22a4dfba/ijerph-20-01400-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/a91a23987187/ijerph-20-01400-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ada6/9859118/756b1c330868/ijerph-20-01400-g011.jpg
