A Review on Automatic Facial Expression Recognition Systems Assisted by Multimodal Sensor Data.

Affiliations

School of Information Technology, Deakin University, Burwood VIC 3125, Australia.

Data61, CSIRO, Battery Point TAS 7004, Australia.

Publication information

Sensors (Basel). 2019 Apr 18;19(8):1863. doi: 10.3390/s19081863.

Abstract

Facial Expression Recognition (FER) can be applied across many research areas, such as mental disease diagnosis and the detection of human social/physiological interaction. With advances in hardware and sensor technology, FER systems are now being developed for real-world application scenes rather than laboratory environments only. Although laboratory-controlled FER systems achieve very high accuracy, around 97%, transferring the technology from the laboratory to real-world applications faces a major barrier: accuracy falls to approximately 50%. In this survey, we comprehensively discuss three significant challenges in unconstrained real-world environments, namely illumination variation, head pose, and subject-dependence, which may not be resolved by analysing images/videos alone. We focus on sensors that may provide extra information and help FER systems detect emotion in both static images and video sequences. We introduce three categories of sensors that may improve the accuracy and reliability of an expression recognition system by tackling these challenges of pure image/video processing. The first category is detailed-face sensors, such as eye-trackers, which detect small dynamic changes of a face component and may help differentiate background noise from facial features. The second is non-visual sensors, such as audio, depth, and EEG sensors, which provide information beyond the visual dimension and improve recognition reliability, for example under illumination variation and position shift. The last is target-focused sensors, such as infrared thermal sensors, which can help FER systems filter out useless visual content and may help resist illumination variation. We also discuss methods for fusing the different inputs obtained from multimodal sensors in an emotion system. We comparatively review the most prominent multimodal emotional expression recognition approaches and point out their advantages and limitations. We briefly introduce the benchmark data sets related to FER systems for each category of sensors and extend our survey to the open challenges and issues. In addition, we design a framework for an expression recognition system that uses multimodal sensor data (provided by the three categories of sensors) to provide complete information about emotions and assist pure face image/video analysis. We theoretically analyse the feasibility and achievability of our new expression recognition system, especially for use in the wild, and point out future directions for designing an efficient emotional expression recognition system.
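
The abstract mentions fusing inputs from multimodal sensors without saying which fusion methods the survey covers. As a rough illustration only, one common option is decision-level (late) fusion, in which each modality's classifier outputs class probabilities and the system combines them by weighted averaging. The sketch below is not taken from the paper: the emotion label set, the modality names, the function names `fuse_late` and `predict_emotion`, and all probabilities and weights are assumptions made for the example.

```python
from typing import Dict, List

# Assumed label set (not specified in the abstract): six basic emotions plus neutral.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]


def fuse_late(probs_by_modality: Dict[str, List[float]],
              weights: Dict[str, float]) -> List[float]:
    """Decision-level fusion: weighted average of per-modality class probabilities.

    Only modalities present in `probs_by_modality` contribute, and their
    weights are renormalised, so a missing sensor does not skew the result.
    """
    if not probs_by_modality:
        raise ValueError("at least one modality is required")
    total_weight = sum(weights[m] for m in probs_by_modality)
    if total_weight <= 0:
        raise ValueError("weights of the available modalities must sum to > 0")

    fused = [0.0] * len(EMOTIONS)
    for modality, probs in probs_by_modality.items():
        if len(probs) != len(EMOTIONS):
            raise ValueError(f"{modality}: expected {len(EMOTIONS)} probabilities")
        w = weights[modality] / total_weight
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused


def predict_emotion(probs_by_modality: Dict[str, List[float]],
                    weights: Dict[str, float]) -> str:
    """Return the label with the highest fused probability."""
    fused = fuse_late(probs_by_modality, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best]


# Illustrative numbers only: a dimly lit scene where the RGB classifier is
# unsure, while the (hypothetical) thermal and audio classifiers are not.
example_probs = {
    "rgb":     [0.20, 0.10, 0.10, 0.30, 0.10, 0.10, 0.10],
    "thermal": [0.05, 0.05, 0.05, 0.70, 0.05, 0.05, 0.05],
    "audio":   [0.05, 0.05, 0.05, 0.60, 0.10, 0.10, 0.05],
}
example_weights = {"rgb": 0.2, "thermal": 0.4, "audio": 0.4}
label = predict_emotion(example_probs, example_weights)
```

Down-weighting the RGB stream here mirrors the abstract's point that non-visual and thermal sensors may help under illumination variation. In practice the weights would be tuned or learned rather than fixed by hand, and feature-level (early) fusion is an alternative design.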
