Wang Dandan, Zhao Xiaoming
Department of Computer Science, Taizhou University, Taizhou, China.
Front Neurosci. 2022 Aug 26;16:984404. doi: 10.3389/fnins.2022.984404. eCollection 2022.
Traditional video recommendation provides viewers with customized media content based on their historical records (e.g., ratings, reviews). However, such systems tend to perform poorly when the data are insufficient, which leads to the cold-start problem. An affective video recommender system (AVRS) is a multidisciplinary and multimodal human-robot interaction (HRI) system that draws on physical, physiological, neuroscience, and computer science disciplines and on multimedia resources, including text, audio, and video. As a promising research domain, AVRS applies advanced affective analysis technologies to video resources; therefore, it can solve the cold-start problem. In AVRS, viewers' emotional responses can be obtained through various techniques, including physical signals (e.g., facial expressions, gestures, and speech) and internal signals (e.g., physiological signals). Changes in these signals can be detected when viewers face specific situations. Physiological signals are a response of the central and autonomic nervous systems and are mostly activated involuntarily, so they cannot be easily controlled; they are therefore suitable for reliable emotion analysis. Physical signals can be recorded by a webcam or recorder. In contrast, physiological signals can be collected by various equipment, e.g., psychophysiological heart rate (HR) signals calculated from an electrocardiogram (ECG), electrodermal activity (EDA), brain activity from electroencephalography (EEG) signals, skin conductance response (SCR) measured by a galvanic skin response (GSR) sensor, and photoplethysmography (PPG) for estimating users' pulse. This survey aims to provide a comprehensive overview of the AVRS domain. To analyze recent efforts in the field of affective video recommendation, we collected 92 relevant published articles from Google Scholar and summarized the articles and their key findings. We review these articles from different perspectives, including traditional recommendation algorithms and advanced deep learning-based algorithms, the commonly used affective video recommendation databases, audience response categories, and evaluation methods. Finally, we summarize the challenges facing AVRS and outline potential future research directions.