Zhang Lingyu
Space Lifestyle Design, Kookmin University, Seoul, Republic of Korea.
PeerJ Comput Sci. 2024 Oct 31;10:e2450. doi: 10.7717/peerj-cs.2450. eCollection 2024.
In interior interaction design, achieving intelligent user-interior interaction depends on understanding the user's emotional responses, so precise identification of the user's visual emotions is of paramount importance. Current visual emotion recognition methods rely on a single feature, predominantly facial expressions, which covers visual characteristics inadequately and yields low recognition rates. This study introduces a deep learning-based multimodal weighting network model to address this challenge. The model begins with a convolutional attention module that embeds a self-attention mechanism within a convolutional neural network (CNN). The multimodal weighting network then fuses the features of each modality, optimizing the modality weights during training. Finally, a weight network classifier is derived from these optimized weights to perform visual emotion recognition. Experimental results show a 77.057% correctness rate and a 74.75% accuracy rate in visual emotion recognition. Comparative analysis against existing models demonstrates the superiority of the multimodal weighting network model and its potential to make indoor interaction design more human-centric and intelligent.
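As a rough illustration of the pipeline the abstract describes, the sketch below wires a self-attention block into a small per-modality CNN and fuses the per-modality features with learnable, softmax-normalized weights before classification. This is a minimal sketch assuming PyTorch; the layer sizes, the two-modality setup (e.g., a face crop plus the full scene), the 7-class output, and all module names are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the described architecture, assuming PyTorch.
# Layer sizes, module names, and the two-modality setup are
# illustrative assumptions, not the paper's specification.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSelfAttention(nn.Module):
    """Convolutional attention module: self-attention over CNN feature maps."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual scale

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)           # (b, hw, c//8)
        k = self.key(x).flatten(2)                             # (b, c//8, hw)
        v = self.value(x).flatten(2)                           # (b, c, hw)
        attn = F.softmax(q @ k / (q.size(-1) ** 0.5), dim=-1)  # (b, hw, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out  # residual connection

class ModalityBranch(nn.Module):
    """CNN backbone + attention producing a fixed-size feature per modality."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.attn = ConvSelfAttention(64)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, x):
        f = self.attn(self.backbone(x))
        return self.proj(self.pool(f).flatten(1))

class MultimodalWeightNet(nn.Module):
    """Fuses modality features with weights that are optimized during training."""
    def __init__(self, num_modalities=2, feat_dim=128, num_classes=7):
        super().__init__()
        self.branches = nn.ModuleList(
            ModalityBranch(feat_dim) for _ in range(num_modalities)
        )
        # Learnable per-modality weights, softmax-normalized at fusion time.
        self.modality_logits = nn.Parameter(torch.zeros(num_modalities))
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, inputs):  # inputs: list of (b, 3, H, W) tensors
        feats = torch.stack(
            [branch(x) for branch, x in zip(self.branches, inputs)], dim=1
        )                                                # (b, M, feat_dim)
        w = F.softmax(self.modality_logits, dim=0)       # optimized weights
        fused = (w.view(1, -1, 1) * feats).sum(dim=1)    # weighted fusion
        return self.classifier(fused)

# Usage: two 96x96 modality inputs (e.g., face crop and scene), 7 emotion classes.
model = MultimodalWeightNet()
logits = model([torch.randn(4, 3, 96, 96), torch.randn(4, 3, 96, 96)])
print(logits.shape)  # torch.Size([4, 7])
```

One design note on this reading of the abstract: keeping the modality weights as a trained parameter (rather than a fixed hyperparameter) lets backpropagation decide how much each visual cue contributes, and the derived classifier simply consumes the weighted fusion.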