School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, China.
Sensors (Basel). 2021 Mar 18;21(6):2136. doi: 10.3390/s21062136.
Sharing our feelings through content with images and short videos is one of the main modes of expression on social networks. Visual content can affect people's emotions, so the task of analyzing the sentiment conveyed by visual content has attracted increasing attention. Most current methods focus on improving local emotional representations to achieve better sentiment-analysis performance, and ignore the problem of perceiving objects of different scales and different emotional intensities in complex scenes. In this paper, based on alterable-scale and multi-level local regional emotional-affinity analysis under a global perspective, we propose a multi-level context pyramid network (MCPNet) for visual sentiment analysis that combines local and global representations to improve classification performance. First, ResNet-101 is employed as the backbone to obtain multi-level emotional representations carrying different degrees of semantic and detailed information. Next, multi-scale adaptive context modules (MACM) are proposed to learn the degree of sentiment correlation among regions at different scales in the image, and to extract multi-scale context features for each level of deep representation. Finally, the context features from the different levels are combined into a multi-cue sentiment feature for image sentiment classification. Extensive experimental results on seven commonly used visual sentiment datasets show that our method outperforms state-of-the-art methods; in particular, its accuracy on the FI dataset exceeds 90%.
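The core idea of the pyramid — pooling a backbone feature map at several grid scales and fusing the resulting context maps with the original features — can be illustrated with a minimal NumPy sketch. This is only a toy illustration of the multi-scale context extraction, not the authors' implementation: the function names, the fixed scale set `(1, 2, 4)`, and the plain concatenation are assumptions, and the paper's adaptive per-region affinity weighting in MACM is omitted.

```python
import numpy as np

def avg_pool_grid(feat, grid):
    """Average-pool an (H, W, C) feature map onto a grid x grid layout."""
    H, W, C = feat.shape
    out = np.zeros((grid, grid, C))
    hs, ws = H // grid, W // grid
    for i in range(grid):
        for j in range(grid):
            out[i, j] = feat[i*hs:(i+1)*hs, j*ws:(j+1)*ws].mean(axis=(0, 1))
    return out

def upsample_nearest(pooled, H, W):
    """Nearest-neighbour upsample of a (g, g, C) map back to (H, W, C)."""
    g = pooled.shape[0]
    rows = np.repeat(np.arange(g), H // g)[:H]
    cols = np.repeat(np.arange(g), W // g)[:W]
    return pooled[rows][:, cols]

def multi_scale_context(feat, scales=(1, 2, 4)):
    """Concatenate the original features with context maps pooled at
    several scales and upsampled back to the input resolution.
    (The adaptive affinity weighting of the paper's MACM is omitted.)"""
    H, W, C = feat.shape
    ctx = [feat] + [upsample_nearest(avg_pool_grid(feat, s), H, W)
                    for s in scales]
    return np.concatenate(ctx, axis=-1)

feat = np.random.rand(8, 8, 16)   # stand-in for one ResNet-101 stage output
fused = multi_scale_context(feat)
print(fused.shape)                # (8, 8, 64): original + three context maps
```

In the full model this fusion would be applied to each backbone level, and the per-level results combined into the multi-cue feature fed to the classifier; the scale-1 context here is simply the global average of the feature map, which supplies the "global perspective" mentioned in the abstract.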