Wang James Z, Zhao Sicheng, Wu Chenyan, Adams Reginald B, Newman Michelle G, Shafir Tal, Tsachor Rachelle
College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802 USA.
Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing 100084, China.
Proc IEEE Inst Electr Electron Eng. 2023 Oct;111(10):1236-1286. doi: 10.1109/JPROC.2023.3273517. Epub 2023 May 23.
The emergence of artificial emotional intelligence technology is revolutionizing the fields of computers and robotics, allowing for a new level of communication and understanding of human behavior that was once thought impossible. While recent advancements in deep learning have transformed the field of computer vision, automated understanding of evoked or expressed emotions in visual media remains in its infancy. This foundering stems from the absence of a universally accepted definition of "emotion," coupled with the inherently subjective nature of emotions and their intricate nuances. In this article, we provide a comprehensive, multidisciplinary overview of the field of emotion analysis in visual media, drawing on insights from psychology, engineering, and the arts. We begin by exploring the psychological foundations of emotion and the computational principles that underpin the understanding of emotions from images and videos. We then review the latest research and systems within the field, accentuating the most promising approaches. We also discuss the current technological challenges and limitations of emotion analysis, underscoring the necessity for continued investigation and innovation. We contend that this represents a "Holy Grail" research problem in computing and delineate pivotal directions for future inquiry. Finally, we examine the ethical ramifications of emotion-understanding technologies and contemplate their potential societal impacts. Overall, this article endeavors to equip readers with a deeper understanding of the domain of emotion analysis in visual media and to inspire further research and development in this captivating and rapidly evolving field.
人工情感智能技术的出现正在彻底改变计算机和机器人领域,实现了曾经被认为不可能的新层次的人类行为沟通与理解。虽然深度学习的最新进展已经改变了计算机视觉领域,但对视觉媒体中诱发或表达的情感进行自动理解仍处于起步阶段。这种困境源于缺乏对“情感”的普遍接受的定义,以及情感固有的主观性及其错综复杂的细微差别。在本文中,我们借鉴心理学、工程学和艺术领域的见解,对视觉媒体中的情感分析领域进行了全面的多学科概述。我们首先探讨情感的心理学基础以及从图像和视频中理解情感的计算原理。然后,我们回顾该领域的最新研究和系统,突出最有前景的方法。我们还讨论了情感分析当前的技术挑战和局限性,强调持续研究和创新的必要性。我们认为这是计算领域的一个“圣杯”研究问题,并勾勒出未来研究的关键方向。最后,我们研究情感理解技术的伦理影响,并思考它们可能对社会产生的影响。总体而言,本文旨在让读者更深入地了解视觉媒体中的情感分析领域,并激发在这个引人入胜且迅速发展的领域进行进一步的研究和开发。