
Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks.

Authors

Vázquez-Romero Adrián, Gallardo-Antolín Ascensión

Affiliation

Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Avda. de la Universidad, 30, Leganés, 28911 Madrid, Spain.

Publication

Entropy (Basel). 2020 Jun 20;22(6):688. doi: 10.3390/e22060688.

DOI: 10.3390/e22060688
PMID: 33286460
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7517226/
Abstract

This paper proposes a speech-based method for automatic depression classification. The system is based on ensemble learning for Convolutional Neural Networks (CNNs) and is evaluated using the data and the experimental protocol provided in the Depression Classification Sub-Challenge (DCC) at the 2016 Audio-Visual Emotion Challenge (AVEC-2016). In the pre-processing phase, speech files are represented as a sequence of log-spectrograms and randomly sampled to balance positive and negative samples. For the classification task itself, first, a more suitable architecture for this task, based on One-Dimensional Convolutional Neural Networks, is built. Secondly, several of these CNN-based models are trained with different initializations and then the corresponding individual predictions are fused by using an Ensemble Averaging algorithm and combined per speaker to get an appropriate final decision. The proposed ensemble system achieves satisfactory results on the DCC at the AVEC-2016 in comparison with a reference system based on Support Vector Machines and hand-crafted features, with a CNN+LSTM-based system called DepAudionet, and with the case of a single CNN-based classifier.
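
The pipeline outlined in the abstract (class balancing by random sampling, ensemble averaging over several CNNs, and a per-speaker decision) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the sampling scheme or the speaker-level rule, so the undersampling strategy, the mean-score threshold of 0.5, the function names, and the toy probabilities below are all assumptions made for illustration. The CNNs themselves are omitted; their segment-level outputs are stood in for by hand-written numbers.

```python
import random
from collections import defaultdict


def balance_by_random_sampling(segments, labels, seed=0):
    """Randomly undersample the larger class so both classes have equal size.

    Assumption: balancing is done by undersampling; the abstract only says
    that segments are "randomly sampled" to balance the classes.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for segment, label in zip(segments, labels):
        by_class[label].append(segment)
    n_keep = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for segment in rng.sample(items, n_keep):
            balanced.append((segment, label))
    rng.shuffle(balanced)
    return balanced


def ensemble_average(model_probs):
    """Average segment-level probabilities across models.

    model_probs: one list of probabilities per model, all of equal length.
    """
    n_models = len(model_probs)
    return [sum(column) / n_models for column in zip(*model_probs)]


def decide_per_speaker(segment_probs, speaker_ids, threshold=0.5):
    """Hypothetical speaker rule: depressed if the mean fused score >= threshold."""
    scores = defaultdict(list)
    for prob, speaker in zip(segment_probs, speaker_ids):
        scores[speaker].append(prob)
    return {s: sum(v) / len(v) >= threshold for s, v in scores.items()}


# Toy data (made up): 10 segment ids, 3 positive and 7 negative.
balanced = balance_by_random_sampling(list(range(10)), [1] * 3 + [0] * 7)

# Toy outputs of 3 models on 5 segments from 2 speakers (made up).
probs = [
    [0.9, 0.7, 0.8, 0.2, 0.1],
    [0.8, 0.6, 0.9, 0.3, 0.2],
    [0.7, 0.8, 0.7, 0.1, 0.3],
]
speakers = ["s1", "s1", "s1", "s2", "s2"]

fused = ensemble_average(probs)
decisions = decide_per_speaker(fused, speakers)
print(decisions)
```

Averaging probabilities before thresholding lets models trained from different initializations offset each other's segment-level errors, and pooling by speaker reflects that the label applies to the person rather than to an individual segment.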

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/ac465044c55c/entropy-22-00688-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/0d0366f1aaff/entropy-22-00688-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/094e7a991344/entropy-22-00688-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/6700cdf266c4/entropy-22-00688-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/b3aadc1fd33c/entropy-22-00688-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/5b135942855d/entropy-22-00688-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e79c/7517226/2e6145f0f357/entropy-22-00688-g007.jpg

Similar articles

1. Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks.
   Entropy (Basel). 2020 Jun 20;22(6):688. doi: 10.3390/e22060688.
2. Ensemble learning with speaker embeddings in multiple speech task stimuli for depression detection.
   Front Neurosci. 2023 Mar 23;17:1141621. doi: 10.3389/fnins.2023.1141621. eCollection 2023.
3. A CAD system for pulmonary nodule prediction based on deep three-dimensional convolutional neural networks and ensemble learning.
   PLoS One. 2019 Jul 12;14(7):e0219369. doi: 10.1371/journal.pone.0219369. eCollection 2019.
4. Lipreading Architecture Based on Multiple Convolutional Neural Networks for Sentence-Level Visual Speech Recognition.
   Sensors (Basel). 2021 Dec 23;22(1):72. doi: 10.3390/s22010072.
5. Automatic Depression Detection Using Smartphone-Based Text-Dependent Speech Signals: Deep Convolutional Neural Network Approach.
   J Med Internet Res. 2023 Jan 25;25:e34474. doi: 10.2196/34474.
6. Neuronetwork Approach in the Early Diagnosis of Depression.
   Psychiatr Danub. 2023 Oct;35(Suppl 2):77-85.
7. 3D CNN-Based Speech Emotion Recognition Using K-Means Clustering and Spectrograms.
   Entropy (Basel). 2019 May 8;21(5):479. doi: 10.3390/e21050479.
8. A Hybrid Time-Distributed Deep Neural Architecture for Speech Emotion Recognition.
   Int J Neural Syst. 2022 Jun;32(6):2250024. doi: 10.1142/S0129065722500241. Epub 2022 May 12.
9. Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network.
   Sensors (Basel). 2020 Oct 23;20(21):6008. doi: 10.3390/s20216008.
10. End-to-end multimodal clinical depression recognition using deep neural networks: A comparative analysis.
    Comput Methods Programs Biomed. 2021 Nov;211:106433. doi: 10.1016/j.cmpb.2021.106433. Epub 2021 Sep 28.

Cited by

1. AI-assisted multi-modal information for the screening of depression: a systematic review and meta-analysis.
   NPJ Digit Med. 2025 Aug 16;8(1):523. doi: 10.1038/s41746-025-01933-3.
2. Detecting depression in speech using verbal behavior analysis: a cross-cultural study.
   Front Psychol. 2025 May 29;16:1514918. doi: 10.3389/fpsyg.2025.1514918. eCollection 2025.
3. Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care.
   Depress Anxiety. 2024 Apr 9;2024:9667377. doi: 10.1155/2024/9667377. eCollection 2024.
4. Enhancing depression recognition through a mixed expert model by integrating speaker-related and emotion-related features.
   Sci Rep. 2025 Feb 3;15(1):4064. doi: 10.1038/s41598-025-88313-9.
5. DPD (DePression Detection) Net: a deep neural network for multimodal depression detection.
   Health Inf Sci Syst. 2024 Nov 12;12(1):53. doi: 10.1007/s13755-024-00311-9. eCollection 2024 Dec.
6. Resting-State Electroencephalogram Depression Diagnosis Based on Traditional Machine Learning and Deep Learning: A Comparative Analysis.
   Sensors (Basel). 2024 Oct 23;24(21):6815. doi: 10.3390/s24216815.
7. Depression detection with machine learning of structural and non-structural dual languages.
   Healthc Technol Lett. 2024 Jun 10;11(4):218-226. doi: 10.1049/htl2.12088. eCollection 2024 Aug.
8. Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals.
   Interspeech. 2023 Aug;2023:2343-2347. doi: 10.21437/interspeech.2023-2101.
9. Classification of emotional stress and physical stress using a multispectral based deep feature extraction model.
   Sci Rep. 2023 Feb 15;13(1):2693. doi: 10.1038/s41598-023-29903-3.
10. Artificial intelligence assisted tools for the detection of anxiety and depression leading to suicidal ideation in adolescents: a review.
    Cogn Neurodyn. 2022 Nov 22;18(1):1-22. doi: 10.1007/s11571-022-09904-0.

References

1. Inferring Clinical Depression from Speech and Spoken Utterances.
   IEEE Int Workshop Mach Learn Signal Process. 2014 Sep;2014. doi: 10.1109/mlsp.2014.6958856. Epub 2014 Nov 20.
2. 3D CNN-Based Speech Emotion Recognition Using K-Means Clustering and Spectrograms.
   Entropy (Basel). 2019 May 8;21(5):479. doi: 10.3390/e21050479.
3. Automated assessment of psychiatric disorders using speech: A systematic review.
   Laryngoscope Investig Otolaryngol. 2020 Jan 31;5(1):96-116. doi: 10.1002/lio2.354. eCollection 2020 Feb.
4. Deep learning-based automated speech detection as a marker of social functioning in late-life depression.
   Psychol Med. 2021 Jul;51(9):1441-1450. doi: 10.1017/S0033291719003994. Epub 2020 Jan 16.
5. Speech analysis for health: Current state-of-the-art and the increasing impact of deep learning.
   Methods. 2018 Dec 1;151:41-54. doi: 10.1016/j.ymeth.2018.07.007. Epub 2018 Aug 10.
6. Epidemiology of Suicide and the Psychiatric Perspective.
   Int J Environ Res Public Health. 2018 Jul 6;15(7):1425. doi: 10.3390/ijerph15071425.
7. Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.
   J Voice. 2019 Sep;33(5):634-641. doi: 10.1016/j.jvoice.2018.02.003. Epub 2018 Mar 19.
8. Advances on Automatic Speech Analysis for Early Detection of Alzheimer Disease: A Non-linear Multi-task Approach.
   Curr Alzheimer Res. 2018;15(2):139-148. doi: 10.2174/1567205014666171120143800.
9. An Ensemble of Fine-Tuned Convolutional Neural Networks for Medical Image Classification.
   IEEE J Biomed Health Inform. 2017 Jan;21(1):31-40. doi: 10.1109/JBHI.2016.2635663. Epub 2016 Dec 5.
10. Voice analysis as an objective state marker in bipolar disorder.
    Transl Psychiatry. 2016 Jul 19;6(7):e856. doi: 10.1038/tp.2016.123.