Paying attention to uncertainty: A stochastic multimodal transformers for post-traumatic stress disorder detection using video.

Affiliation

Université Paris-Est Créteil (UPEC), LISSI, 120, Rue Paul Armangot, Vitry-sur-Seine, 94400, France.

Publication information

Comput Methods Programs Biomed. 2024 Dec;257:108439. doi: 10.1016/j.cmpb.2024.108439. Epub 2024 Sep 26.

DOI: 10.1016/j.cmpb.2024.108439
PMID: 39340932

Abstract

BACKGROUND AND OBJECTIVES

Post-traumatic stress disorder is a debilitating psychological condition that can manifest following exposure to traumatic events. It affects individuals from diverse backgrounds and is associated with various symptoms, including intrusive thoughts, nightmares, hyperarousal, and avoidance behaviors.

METHODS

To address this challenge, this study proposes a decision support system powered by a novel multimodal deep learning approach based on a stochastic Transformer and video data. The Transformer's stochastic activation function and layers allow it to learn sparse representations of the inputs. The method combines low-level features from three modalities: Mel-frequency cepstral coefficients extracted from audio recordings, Facial Action Units captured from facial expressions, and textual data obtained from the audio transcription. By considering these modalities, the proposed model captures a broad range of information related to post-traumatic stress disorder symptoms, including vocal cues, facial expressions, and linguistic content.
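The abstract does not specify the form of the stochastic activation or how the three modalities are combined, so the following is only a minimal sketch under stated assumptions. It uses a dropout-style Bernoulli gate applied after a ReLU (a swapped-in stand-in, not the authors' activation) to show how randomness can produce sparse outputs, and it fuses the modalities by plain concatenation. The names `stochastic_relu`, `fuse_modalities`, `keep_prob`, and all feature values are hypothetical.

```python
import random


def fuse_modalities(audio_mfcc, facial_aus, text_embedding):
    """Assumed early fusion: concatenate the per-modality feature vectors."""
    return list(audio_mfcc) + list(facial_aus) + list(text_embedding)


def stochastic_relu(values, keep_prob=0.5, rng=None):
    """Hypothetical stochastic activation: ReLU followed by a Bernoulli gate.

    Each unit is kept with probability keep_prob and zeroed otherwise.
    Kept units are rescaled by 1 / keep_prob so the expected output
    matches a plain ReLU.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = rng or random.Random()
    out = []
    for v in values:
        activated = max(0.0, v)           # deterministic ReLU part
        keep = rng.random() < keep_prob   # random gate, drawn per unit
        out.append(activated / keep_prob if keep else 0.0)
    return out


# Made-up feature values standing in for MFCCs, Facial Action Units,
# and a text embedding.
fused = fuse_modalities([0.2, -1.3], [0.0, 0.7, 0.4], [-0.5, 1.1])
sparse = stochastic_relu(fused, keep_prob=0.5, rng=random.Random(0))
```

With `keep_prob=1.0` the gate never fires and the function reduces to an ordinary ReLU. Lower values zero out more units, which is one simple way a stochastic layer can encourage sparse representations.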

RESULTS

The deep learning model was trained and evaluated on the eDAIC dataset, which consists of clinical interviews with individuals with and without post-traumatic stress disorder. The model achieved a Root Mean Square Error of 1.98 and a Concordance Correlation Coefficient of 0.722, which the authors report as state-of-the-art results compared with existing approaches.
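For readers unfamiliar with the two reported metrics, the sketch below computes them from their standard definitions: RMSE is the square root of the mean squared difference, and the Concordance Correlation Coefficient is 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population variances. The function names and the sample scores are illustrative assumptions, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def concordance_cc(y_true, y_pred):
    """Concordance Correlation Coefficient (population moments)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(y_true, y_pred)) / n
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for identical constant inputs")
    return 2 * cov / denom


# Made-up severity scores, for illustration only.
true_scores = [1.0, 2.0, 3.0]
shifted_preds = [2.0, 3.0, 4.0]
error = rmse(true_scores, shifted_preds)
agreement = concordance_cc(true_scores, shifted_preds)
```

Unlike Pearson correlation, the CCC penalizes a constant offset between predictions and labels: the shifted predictions above are perfectly correlated with the true scores, yet their CCC is below 1.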

CONCLUSION

This work introduces a new method for detecting post-traumatic stress disorder from videos using a multimodal stochastic Transformer model. The model draws on text, audio, and visual modalities to gather comprehensive and varied information for the detection.

