Depression recognition using voice-based pre-training model.

Affiliation

School of Biomedical Engineering, South-Central Minzu University, No.182, Minzu Avenue, Hongshan District, Wuhan City, 430074, Hubei Province, China.

Publication information

Sci Rep. 2024 Jun 3;14(1):12734. doi: 10.1038/s41598-024-63556-0.

Abstract

Early screening for depression helps patients obtain better diagnosis and treatment. Although voice data has been shown to be effective for depression detection, the problem of insufficient dataset size remains unresolved. We therefore propose an artificial intelligence method for identifying depression. The wav2vec 2.0 voice-based pre-training model was used as a feature extractor to automatically extract high-quality voice features from raw audio, and a small fine-tuning network was used as a classification model to output depression classification results. The proposed model was then fine-tuned on the DAIC-WOZ dataset. In binary classification, it attained an accuracy of 0.9649 and an RMSE of 0.1875 on the test set. In multi-class classification, it attained an accuracy of 0.9481 and an RMSE of 0.3810. The wav2vec 2.0 model was used for depression recognition for the first time and showed strong generalization ability. The method is simple and practical, and it can assist doctors in the early screening of depression.
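The abstract reports accuracy and RMSE but does not say how they were computed. The following is a minimal sketch of one common way to compute both from integer class labels, assuming RMSE is taken directly over predicted and reference labels (an assumption, not something the abstract confirms). The function names and the example labels are hypothetical and are not data from the paper.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error between integer class labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical binary labels (0 = not depressed, 1 = depressed).
# These are illustrative only and are NOT taken from the paper.
y_true = [0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1]

acc = accuracy(y_true, y_pred)  # 6 of 8 labels match
err = rmse(y_true, y_pred)      # 2 of 8 labels differ by exactly 1

print(f"accuracy = {acc:.4f}, RMSE = {err:.4f}")
```

Under this assumption, each binary misclassification contributes a squared error of exactly 1, so RMSE equals the square root of the error rate (1 minus accuracy). In the multi-class case the two metrics carry different information, because RMSE also depends on how far a wrong label is from the correct one.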

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/452e/11637030/aed1c642470b/41598_2024_63556_Fig1_HTML.jpg
