
Hearing vocals to recognize schizophrenia: speech discriminant analysis with fusion of emotions and features based on deep learning.

Authors

Huang Jie, Zhao Yanli, Tian Zhanxiao, Qu Wei, Du Xia, Zhang Jie, Zhang Meng, Tan Yunlong, Wang Zhiren, Tan Shuping

Affiliation

Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Changping District, Beijing, 100096, China.

Publication

BMC Psychiatry. 2025 May 8;25(1):466. doi: 10.1186/s12888-025-06888-z.

DOI: 10.1186/s12888-025-06888-z
PMID: 40340671
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12060412/
Abstract

BACKGROUND AND OBJECTIVE

Schizophrenia is a complex and heterogeneous mental disorder, which makes its accurate detection a major challenge. Current diagnostic criteria rely primarily on clinical symptoms, which may not fully capture individual differences and the heterogeneity of the disorder. This study develops a deep learning-based discriminative model of schizophrenic speech that combines different emotional stimuli and features.

METHODS

A total of 156 schizophrenia patients and 74 healthy controls participated in the study, each reading three fixed texts with varying emotional stimuli. The log-Mel spectrogram and Mel-frequency cepstral coefficients (MFCCs) were extracted using the librosa toolkit (version 0.9.2). Convolutional neural networks were applied to analyze the log-Mel spectrogram. The effects of different emotional stimuli and of fusing demographic information and MFCCs on schizophrenia detection were examined.
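
As a rough illustration of the two feature types named above, the following NumPy sketch computes a log-Mel spectrogram and MFCCs from a waveform. It is written from scratch and is not librosa's implementation (librosa provides these as `librosa.feature.melspectrogram` and `librosa.feature.mfcc`). The function names, the HTK-style mel formula, and all parameter values (512-sample frames, hop of 256, 40 mel bands, 13 coefficients) are assumptions, since the abstract does not report the settings used.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (an assumption; librosa defaults to the Slaney variant).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_and_mfcc(y, sr, n_fft=512, hop=256, n_mels=40, n_mfcc=13):
    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(y) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = y[idx] * np.hanning(n_fft)
    # Power spectrum per frame, then pooling into mel bands.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T
    log_mel = 10.0 * np.log10(np.maximum(mel_power, 1e-10))  # decibels
    # MFCCs: unnormalized DCT-II of the log-mel energies along the mel axis.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    mfcc = log_mel @ dct.T
    # Return arrays shaped (features, frames).
    return log_mel.T, mfcc.T

# Synthetic 1-second, 440 Hz tone standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)
log_mel, mfcc = log_mel_and_mfcc(y, sr)
print(log_mel.shape, mfcc.shape)  # (40, 61) (13, 61)
```

The log-Mel array is the kind of 2-D, image-like input a convolutional network can consume, while the MFCC array is a more compact summary of the same spectrum.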

RESULTS

The discriminant analysis results showed superior performance for neutral emotional stimuli compared to positive and negative stimuli. Integrating different emotional stimuli and fusing features with personal information improved sensitivity and specificity. The best discriminant model achieved an accuracy of 91.7%, sensitivity of 94.9%, specificity of 85.1%, and ROC-AUC of 0.963.
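
The reported accuracy, sensitivity, and specificity can be checked for internal consistency against the sample sizes given in the Methods. The sketch below assumes the metrics were computed over all 230 participants (156 patients, 74 controls), which the abstract does not state; the helper name `confusion_metrics` and the reconstructed counts are illustrative, not values reported by the authors.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of patients correctly identified
    specificity = tn / (tn + fp)   # share of controls correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

N_PATIENTS, N_CONTROLS = 156, 74

# Back out whole-number counts from the reported rates (assumed, not reported).
tp = round(0.949 * N_PATIENTS)   # 148 patients correctly classified
tn = round(0.851 * N_CONTROLS)   # 63 controls correctly classified

acc, sens, spec = confusion_metrics(tp, N_PATIENTS - tp, tn, N_CONTROLS - tn)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
# accuracy=91.7% sensitivity=94.9% specificity=85.1%
```

Under that assumption, 148 of 156 patients and 63 of 74 controls classified correctly reproduce all three reported percentages. Because patients outnumber controls roughly two to one, accuracy is weighted toward sensitivity.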

CONCLUSIONS

Speech analysis under neutral emotional stimulation demonstrated greater differences between schizophrenia patients and healthy controls, enhancing discriminative analysis of schizophrenia. Integrating different emotions, demographic information and MFCCs improved the accuracy of schizophrenia detection. This study provides a methodological foundation for constructing a personalized speech detection model for schizophrenia.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e55/12060412/13aed2081720/12888_2025_6888_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e55/12060412/f76e55e32732/12888_2025_6888_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e55/12060412/5f7bbe7a03c7/12888_2025_6888_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e55/12060412/2f72054f00dc/12888_2025_6888_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e55/12060412/9ea3eaa619bf/12888_2025_6888_Fig5_HTML.jpg

Similar articles

1
Hearing vocals to recognize schizophrenia: speech discriminant analysis with fusion of emotions and features based on deep learning.
BMC Psychiatry. 2025 May 8;25(1):466. doi: 10.1186/s12888-025-06888-z.
2
Deep learning in automatic detection of dysphonia: Comparing acoustic features and developing a generalizable framework.
Int J Lang Commun Disord. 2023 Mar;58(2):279-294. doi: 10.1111/1460-6984.12783. Epub 2022 Sep 18.
3
Evaluating the clinical utility of speech analysis and machine learning in schizophrenia: A pilot study.
Comput Biol Med. 2023 Sep;164:107359. doi: 10.1016/j.compbiomed.2023.107359. Epub 2023 Aug 13.
4
Emotional stimulated speech-based assisted early diagnosis of depressive disorders using personality-enhanced deep learning.
J Affect Disord. 2025 May 1;376:177-188. doi: 10.1016/j.jad.2025.01.136. Epub 2025 Feb 4.
5
Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network.
J Healthc Eng. 2022 Feb 27;2022:8472947. doi: 10.1155/2022/8472947. eCollection 2022.
6
Sch-net: a deep learning architecture for automatic detection of schizophrenia.
Biomed Eng Online. 2021 Aug 3;20(1):75. doi: 10.1186/s12938-021-00915-2.
7
Detecting emotional valence using time-domain analysis of speech signals.
Annu Int Conf IEEE Eng Med Biol Soc. 2019 Jul;2019:3605-3608. doi: 10.1109/EMBC.2019.8857691.
8
A Narrative Review of Speech and EEG Features for Schizophrenia Detection: Progress and Challenges.
Bioengineering (Basel). 2023 Apr 20;10(4):493. doi: 10.3390/bioengineering10040493.
9
Speech Emotion Recognition Using Attention Model.
Int J Environ Res Public Health. 2023 Mar 14;20(6):5140. doi: 10.3390/ijerph20065140.
10
A Hybrid Time-Distributed Deep Neural Architecture for Speech Emotion Recognition.
Int J Neural Syst. 2022 Jun;32(6):2250024. doi: 10.1142/S0129065722500241. Epub 2022 May 12.

Cited by

1
Taking a look at your speech: identifying diagnostic status and negative symptoms of psychosis using convolutional neural networks.
NPP Digit Psychiatry Neurosci. 2025;3(1):19. doi: 10.1038/s44277-025-00040-1. Epub 2025 Jul 8.
