Paying attention to uncertainty: A stochastic multimodal transformers for post-traumatic stress disorder detection using video.

Affiliations

Université Paris-Est Créteil (UPEC), LISSI, 120, Rue Paul Armangot, Vitry-sur-Seine, 94400, France.

Publication information

Comput Methods Programs Biomed. 2024 Dec;257:108439. doi: 10.1016/j.cmpb.2024.108439. Epub 2024 Sep 26.

Abstract

BACKGROUND AND OBJECTIVES

Post-traumatic stress disorder is a debilitating psychological condition that can manifest following exposure to traumatic events. It affects individuals from diverse backgrounds and is associated with various symptoms, including intrusive thoughts, nightmares, hyperarousal, and avoidance behaviors.

METHODS

To support detection of this disorder, this study proposes a decision support system built on a novel multimodal deep learning approach that applies a stochastic Transformer to video data. The Transformer exploits its stochastic activation function and layers, which allow it to learn sparse representations of its inputs. The method combines low-level features from three modalities: Mel-frequency cepstral coefficients (MFCCs) extracted from the audio recordings, Facial Action Units captured from facial expressions, and text obtained by transcribing the audio. Together, these modalities give the model information on vocal cues, facial expressions, and linguistic content related to post-traumatic stress disorder symptoms.
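The abstract does not specify the form of the stochastic activation function. As a rough illustration only, the sketch below shows one generic way an activation can be both stochastic and sparsity-inducing: a ReLU whose positive outputs pass through a Bernoulli gate (dropout-style). The function name `stochastic_relu` and the parameters `keep_prob`, `training`, and `rng` are hypothetical and are not taken from the paper.

```python
import random


def stochastic_relu(values, keep_prob=0.5, training=True, rng=None):
    """Bernoulli-gated ReLU: an illustrative stand-in, NOT the paper's activation.

    During training, each rectified value is kept with probability `keep_prob`
    (and rescaled by 1 / keep_prob so its expectation is unchanged) or set to
    zero, which yields sparse, randomly varying activations. At evaluation time
    the function reduces to a plain deterministic ReLU.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = rng or random.Random()
    out = []
    for v in values:
        a = max(0.0, v)  # ordinary ReLU
        if training:
            keep = rng.random() < keep_prob  # Bernoulli gate
            a = a / keep_prob if keep else 0.0
        out.append(a)
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same gates
    print(stochastic_relu([-1.0, 0.5, 2.0, 3.0], keep_prob=0.5, rng=rng))
    print(stochastic_relu([-1.0, 0.5, 2.0, 3.0], training=False))
```

Running the training-mode call several times with different seeds gives different zero patterns, which is the kind of variability that uncertainty-aware models sample over; whether the paper uses this gating scheme is an assumption.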

RESULTS

The model was trained and evaluated on the eDAIC dataset, which consists of clinical interviews with individuals with and without post-traumatic stress disorder (PTSD). It achieved a Root Mean Square Error of 1.98 and a Concordance Correlation Coefficient of 0.722, which the authors report as state-of-the-art results that outperform existing approaches.
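For readers unfamiliar with the two reported metrics, the sketch below computes them from their standard definitions: RMSE is the square root of the mean squared difference, and the concordance correlation coefficient is 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (1/n) variances. The function names and the toy scores are illustrative assumptions; this is not the authors' evaluation code, and the numbers are not from the eDAIC dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def ccc(y_true, y_pred):
    """Concordance correlation coefficient (population variances, 1/n)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denom


if __name__ == "__main__":
    # Toy severity scores, made up for illustration only.
    truth = [1.0, 2.0, 3.0]
    shifted = [2.0, 3.0, 4.0]  # perfectly correlated but biased by +1
    print(rmse(truth, shifted))  # 1.0
    print(ccc(truth, shifted))   # 4/7, about 0.571
```

The shifted example shows why CCC is reported alongside RMSE: the predictions correlate perfectly with the targets, yet the constant offset lowers CCC well below 1.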

CONCLUSION

This work introduces a new method for detecting post-traumatic stress disorder from videos using a multimodal stochastic Transformer model. The model draws on text, audio, and visual data to gather comprehensive and varied information for detection.
