Diagnosis of depression based on facial multimodal data.

Author information

Jin Nani, Ye Renjia, Li Peng

Affiliations

Materdicine Lab, School of Life Sciences, Shanghai University, Shanghai, China.

Research Department, Third Xiangya Hospital of Central South University, Changsha, China.

Publication information

Front Psychiatry. 2025 Jan 28;16:1508772. doi: 10.3389/fpsyt.2025.1508772. eCollection 2025.

DOI:10.3389/fpsyt.2025.1508772

PMID:39935533

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11811426/

Abstract

INTRODUCTION

Depression is a serious mental health disorder. Traditional scale-based diagnostic methods are often highly subjective and prone to misdiagnosis, so developing automatic diagnostic tools based on objective indicators is particularly important.

METHODS

This study proposes a deep learning method that fuses multimodal data, namely facial video and audio, to diagnose depression automatically. A spatiotemporal attention module enhances the extraction of visual features, and a Graph Convolutional Network (GCN) combined with a Long Short-Term Memory (LSTM) network analyzes the audio features. Through multimodal feature fusion, the model can capture distinct feature patterns related to depression.

RESULTS

We conduct extensive experiments on a publicly available clinical dataset, the Extended Distress Analysis Interview Corpus (E-DAIC). On this dataset, the model achieves a Mean Absolute Error (MAE) of 3.51 when estimating PHQ-8 scores from recorded interviews.
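For readers unfamiliar with the metric, MAE is the average absolute difference between predicted and reference PHQ-8 scores. The snippet below is a minimal sketch of that calculation only. The function name and the four example scores are hypothetical and are not taken from the study or from E-DAIC.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical PHQ-8 totals (valid range 0-24); illustrative values only.
true_scores = [4, 12, 0, 19]
pred_scores = [6.5, 9.0, 2.0, 15.5]

mae = mean_absolute_error(true_scores, pred_scores)
print(mae)  # 2.75, i.e. (2.5 + 3.0 + 2.0 + 3.5) / 4
```

Under this definition, the reported MAE of 3.51 means the predicted PHQ-8 totals differ from the reference totals by about 3.5 points on average on the 0-24 scale.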

DISCUSSION

Compared with existing methods, our model shows excellent performance in multimodal information fusion and is suitable for the early assessment of depression.
