Understanding Naturalistic Facial Expressions with Deep Learning and Multimodal Large Language Models.

Author affiliations

Department of Experimental Psychology, University College London, London WC1H 0AP, UK.

Department of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany.

Publication information

Sensors (Basel). 2023 Dec 26;24(1):126. doi: 10.3390/s24010126.

Abstract

This paper provides a comprehensive overview of affective computing systems for facial expression recognition (FER) research in naturalistic contexts. The first section presents an updated account of user-friendly FER toolboxes incorporating state-of-the-art deep learning models and elaborates on their neural architectures, datasets, and performances across domains. These sophisticated FER toolboxes can robustly address a variety of challenges encountered in the wild such as variations in illumination and head pose, which may otherwise impact recognition accuracy. The second section of this paper discusses multimodal large language models (MLLMs) and their potential applications in affective science. MLLMs exhibit human-level capabilities for FER and enable the quantification of various contextual variables to provide context-aware emotion inferences. These advancements have the potential to revolutionize current methodological approaches for studying the contextual influences on emotions, leading to the development of contextualized emotion models.
