
Integrating audio and visual modalities for multimodal personality trait recognition via hybrid deep learning.

Authors

Zhao Xiaoming, Liao Yuehui, Tang Zhiwei, Xu Yicheng, Tao Xin, Wang Dandan, Wang Guoyu, Lu Hongsheng

Affiliations

Taizhou Central Hospital (Taizhou University Hospital), Taizhou University, Taizhou, Zhejiang, China.

School of Computer Science, Hangzhou Dianzi University, Hangzhou, China.

Publication information

Front Neurosci. 2023 Jan 6;16:1107284. doi: 10.3389/fnins.2022.1107284. eCollection 2022.

DOI:10.3389/fnins.2022.1107284
PMID:36685221
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9853048/
Abstract

Recently, personality trait recognition, which aims to identify people's first impression behavior data and analyze people's psychological characteristics, has been an interesting and active topic in psychology, affective neuroscience and artificial intelligence. To effectively take advantage of spatio-temporal cues in audio-visual modalities, this paper proposes a new method of multimodal personality trait recognition integrating audio-visual modalities based on a hybrid deep learning framework, which is comprised of convolutional neural networks (CNN), bi-directional long short-term memory network (Bi-LSTM), and the Transformer network. In particular, a pre-trained deep audio CNN model is used to learn high-level segment-level audio features. A pre-trained deep face CNN model is leveraged to separately learn high-level frame-level global scene features and local face features from each frame in dynamic video sequences. Then, these extracted deep audio-visual features are fed into a Bi-LSTM and a Transformer network to individually capture long-term temporal dependency, thereby producing the final global audio and visual features for downstream tasks. Finally, a linear regression method is employed to conduct the single audio-based and visual-based personality trait recognition tasks, followed by a decision-level fusion strategy used for producing the final Big-Five personality scores and interview scores. Experimental results on the public ChaLearn First Impression-V2 personality dataset show the effectiveness of our method, outperforming other used methods.
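The decision-level fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each modality has already produced one predicted score per trait and that fusion is a weighted average of those scores; the abstract does not specify the fusion rule, so the weighting scheme, the function name `fuse_decisions`, the trait keys, and the example numbers are all hypothetical.

```python
from typing import Dict

# Five Big-Five traits plus the interview score named in the abstract.
# The key names are illustrative, not taken from the paper.
TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
    "interview",
)


def fuse_decisions(
    predictions: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-modality trait scores by a weighted average.

    predictions maps a modality name (e.g. "audio") to its per-trait scores.
    weights maps the same modality names to non-negative fusion weights.
    This is an assumed fusion rule, not the rule used by the authors.
    """
    total = sum(weights[m] for m in predictions)
    if total <= 0:
        raise ValueError("fusion weights must sum to a positive value")
    fused = {}
    for trait in TRAITS:
        weighted_sum = sum(weights[m] * predictions[m][trait] for m in predictions)
        fused[trait] = weighted_sum / total
    return fused


# Made-up scores for one video clip, one dictionary per modality.
audio_scores = {trait: 0.4 for trait in TRAITS}
visual_scores = {trait: 0.6 for trait in TRAITS}

fused_scores = fuse_decisions(
    {"audio": audio_scores, "visual": visual_scores},
    {"audio": 1.0, "visual": 3.0},
)
```

Because the weights are normalized inside the function, they only need to express the relative trust placed in each modality. In practice they would be tuned on a validation split rather than fixed by hand.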

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de41/9853048/897f643632d7/fnins-16-1107284-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de41/9853048/843374b93e67/fnins-16-1107284-g001.jpg

Similar articles

1. Integrating audio and visual modalities for multimodal personality trait recognition via hybrid deep learning.
   Front Neurosci. 2023 Jan 6;16:1107284. doi: 10.3389/fnins.2022.1107284. eCollection 2022.
2. End-to-end multimodal clinical depression recognition using deep neural networks: A comparative analysis.
   Comput Methods Programs Biomed. 2021 Nov;211:106433. doi: 10.1016/j.cmpb.2021.106433. Epub 2021 Sep 28.
3. Deep Personality Trait Recognition: A Survey.
   Front Psychol. 2022 May 6;13:839619. doi: 10.3389/fpsyg.2022.839619. eCollection 2022.
4. Prevalence and risk factors analysis of postpartum depression at early stage using hybrid deep learning model.
   Sci Rep. 2024 Feb 24;14(1):4533. doi: 10.1038/s41598-024-54927-8.
5. Personality-Based Emotion Recognition Using EEG Signals with a CNN-LSTM Network.
   Brain Sci. 2023 Jun 14;13(6):947. doi: 10.3390/brainsci13060947.
6. Neuronetwork Approach in the Early Diagnosis of Depression.
   Psychiatr Danub. 2023 Oct;35(Suppl 2):77-85.
7. A multimodal convolutional neuro-fuzzy network for emotion understanding of movie clips.
   Neural Netw. 2019 Oct;118:208-219. doi: 10.1016/j.neunet.2019.06.010. Epub 2019 Jul 2.
8. A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance.
   BMC Med Res Methodol. 2022 Jul 2;22(1):181. doi: 10.1186/s12874-022-01665-y.
9. An Incremental Class-Learning Approach with Acoustic Novelty Detection for Acoustic Event Recognition.
   Sensors (Basel). 2021 Oct 5;21(19):6622. doi: 10.3390/s21196622.
10. A novel hybrid deep learning IChOA-CNN-LSTM model for modality-enriched and multilingual emotion recognition in social media.
    Sci Rep. 2024 Sep 27;14(1):22270. doi: 10.1038/s41598-024-73452-2.
