A Multimodal Approach for Detection and Assessment of Depression Using Text, Audio and Video.

Author Information

Zhang Wei, Mao Kaining, Chen Jie

Affiliations

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2R3, Canada.

Academy of Engineering and Technology, Fudan University, Shanghai 200433, China.

Publication Information

Phenomics. 2024 May 3;4(3):234-249. doi: 10.1007/s43657-023-00152-8. eCollection 2024 Jun.


DOI: 10.1007/s43657-023-00152-8
PMID: 39398421
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11467147/
Abstract

Depression is one of the most common mental disorders, and its prevalence increases each year. Traditional diagnosis relies primarily on professional judgment, which is prone to individual bias, so an effective and robust method for automated depression detection is needed. Current artificial intelligence approaches are limited in their ability to extract features from long sentences, and existing models become less robust as input dimensionality grows. To address these concerns, a multimodal fusion model combining text, audio, and video was developed for both depression detection and assessment. In the text modality, pre-trained sentence embeddings were used to extract semantic representations, and a bidirectional long short-term memory (BiLSTM) network predicted depression. In the audio modality, principal component analysis (PCA) reduced the dimensionality of the input feature space and a support vector machine (SVM) predicted depression. In the video modality, extreme gradient boosting (XGBoost) performed both feature selection and depression detection. Final predictions were obtained by combining the outputs of the different modalities with an ensemble voting algorithm. Experiments on the Distress Analysis Interview Corpus Wizard-of-Oz (DAIC-WOZ) dataset showed a substantial improvement in performance, with a weighted F1 score of 0.85, a root mean square error (RMSE) of 5.57, and a mean absolute error (MAE) of 4.48. The proposed model outperforms the baseline in both depression detection and assessment, and performs better than other existing state-of-the-art depression detection methods.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s43657-023-00152-8.
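The abstract describes a late-fusion design: one predictor per modality (sentence embeddings with a BiLSTM for text, PCA followed by an SVM for audio, XGBoost for video), with the three binary outputs combined by ensemble voting. The sketch below illustrates only that voting scheme and is not the authors' code: the feature matrices and dimensions are synthetic placeholders, a logistic-regression classifier stands in for the BiLSTM text branch, and scikit-learn's gradient boosting stands in for XGBoost.

```python
# Minimal sketch of the late-fusion idea described in the abstract (not the
# authors' implementation). Three per-modality classifiers are trained
# separately and their binary predictions are merged by majority voting.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200                                 # hypothetical number of interview sessions
y = rng.integers(0, 2, size=n)          # 1 = depressed, 0 = not depressed (assumed labels)

X_text = rng.normal(size=(n, 384))      # placeholder sentence-embedding vectors
X_audio = rng.normal(size=(n, 1000))    # placeholder high-dimensional acoustic features
X_video = rng.normal(size=(n, 300))     # placeholder facial/visual features

# Per-modality models, loosely following the abstract's description.
text_clf = LogisticRegression(max_iter=1000)             # stand-in for BiLSTM on embeddings
audio_clf = make_pipeline(PCA(n_components=50), SVC())   # PCA for dimensionality reduction + SVM
video_clf = GradientBoostingClassifier()                 # stand-in for XGBoost

preds = []
for clf, X in [(text_clf, X_text), (audio_clf, X_audio), (video_clf, X_video)]:
    clf.fit(X, y)                       # in practice: proper train/dev split on DAIC-WOZ
    preds.append(clf.predict(X))

# Ensemble voting: a session is flagged as depressed if at least two of the
# three modality classifiers predict the positive class.
votes = np.stack(preds, axis=0)
fused = (votes.sum(axis=0) >= 2).astype(int)
print("fused predictions:", fused[:10])
```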


Similar Articles

[1] A Multimodal Approach for Detection and Assessment of Depression Using Text, Audio and Video. Phenomics. 2024-5-3
[2] End-to-end multimodal clinical depression recognition using deep neural networks: A comparative analysis. Comput Methods Programs Biomed. 2021-11
[3] Multimodal machine learning for language and speech markers identification in mental health. BMC Med Inform Decis Mak. 2024-11-22
[4] Diagnosis of depression based on facial multimodal data. Front Psychiatry. 2025-1-28
[5] DepITCM: an audio-visual method for detecting depression. Front Psychiatry. 2025-1-23
[6] Automatic depression severity assessment with deep learning using parameter-efficient tuning. Front Psychiatry. 2023-6-15
[7] Multi-Head Attention-Based Long Short-Term Memory for Depression Detection From Speech. Front Neurorobot. 2021-8-26
[8] Optimizing depression detection in clinical doctor-patient interviews using a multi-instance learning framework. Sci Rep. 2025-2-24
[9] A New Regression Model for Depression Severity Prediction Based on Correlation among Audio Features Using a Graph Convolutional Neural Network. Diagnostics (Basel). 2023-2-14
[10] Harnessing multimodal approaches for depression detection using large language models and facial expressions. Npj Ment Health Res. 2024-12-23

Cited By

[1] SMF-net: semantic-guided multimodal fusion network for precise pancreatic tumor segmentation in medical CT image. Front Oncol. 2025-7-18
[2] A new metoposaurid (Temnospondyli) bonebed from the lower Popo Agie Formation (Carnian, Triassic) and an assessment of skeletal sorting. PLoS One. 2025-4-2

References

[1] Breast cancer diagnosis using the fast learning network algorithm. Front Oncol. 2023-4-27
[2] Ensemble Approach on Deep and Handcrafted Features for Neonatal Bowel Sound Detection. IEEE J Biomed Health Inform. 2023-6
[3] Particle Swarm Optimization-Based Extreme Learning Machine for COVID-19 Detection. Cognit Comput. 2022-10-12
[4] Multimodal model with text and drug embeddings for adverse drug reaction classification. J Biomed Inform. 2022-11
[5] Gray wolf optimization-extreme learning machine approach for diabetic retinopathy detection. Front Public Health. 2022
[6] Using mHealth Technologies to Promote Public Health and Well-Being in Urban Areas with Blue-Green Solutions. Stud Health Technol Inform. 2022-6-29
[7] Deep Learning-Based Methods for Sentiment Analysis on Nepali COVID-19-Related Tweets. Comput Intell Neurosci. 2021
[8] Classification of COVID-19 Chest CT Images Based on Ensemble Deep Learning. J Healthc Eng. 2021
[9] Vector representation based on a supervised codebook for Nepali documents classification. PeerJ Comput Sci. 2021-3-3
[10] Prospective testing of a neurophysiologic biomarker for treatment decisions in major depressive disorder: The PRISE-MD trial. J Psychiatr Res. 2020-5
