
Advances in Video Emotion Recognition: Challenges and Trends.

Authors

Yi Yun, Zhou Yunkang, Wang Tinghua, Zhou Jin

Affiliations

School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China.

Key Laboratory of Data Science and Artificial Intelligence of Jiangxi Education Institutes, Gannan Normal University, Ganzhou 341000, China.

Publication

Sensors (Basel). 2025 Jun 9;25(12):3615. doi: 10.3390/s25123615.

DOI: 10.3390/s25123615
PMID: 40573502

Abstract

Video emotion recognition (VER), situated at the intersection of affective computing and computer vision, aims to predict the primary emotion that video content evokes in most viewers, with broad applications in video recommendation, human-computer interaction, and intelligent education. This paper begins by analyzing the psychological models that form the theoretical foundation of VER. It then describes the datasets and evaluation metrics commonly used in VER, reviews VER algorithms by category, and compares the experimental results of classic methods on four datasets. Based on this analysis, the paper identifies the main challenges currently facing the field: the gap between emotional representations and labels, the need for large-scale, high-quality VER datasets, and the efficient integration of multiple modalities. Finally, it proposes potential research directions to address these challenges, such as advanced neural network architectures, efficient multimodal fusion strategies, high-quality emotional representations, and robust active learning strategies.
