Speech Emotion Recognition in People at High Risk of Dementia.

Authors

Kim Dongseon, Yi Bongwon, Won Yugwon

Affiliations

Department of Silver Business, Sookmyung Women's University, Seoul, Korea.

Department of Communication Disorders, Korea Nazarene University, Cheonan, Korea.

Publication information

Dement Neurocogn Disord. 2024 Jul;23(3):146-160. doi: 10.12779/dnd.2024.23.3.146. Epub 2024 Jul 24.

DOI:10.12779/dnd.2024.23.3.146
PMID:39113753
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11300689/
Abstract

BACKGROUND AND PURPOSE

Understanding the emotions of people at various stages of dementia can inform prevention, early intervention, and care planning. Because technology is now available for understanding and addressing people's emotional needs, this study aims to develop speech emotion recognition (SER) technology that classifies the emotions of people at high risk of dementia.

METHODS

Speech samples from people at high risk of dementia were categorized into distinct emotions by human listeners, and the resulting labels were used as annotations to guide a deep-learning method. The architecture combined a convolutional neural network, long short-term memory, and attention layers with Wav2Vec2, a novel feature extractor, to build an automated SER system.
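
The abstract names attention layers but does not say which form they take. As a minimal sketch only, the block below shows dot-product attention pooling, one common way to collapse a sequence of frame-level feature vectors (such as recurrent-layer outputs) into a single utterance-level vector. The function names, the `query` vector, and the toy inputs are all assumptions for illustration, not the authors' implementation.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Pool frame vectors into one vector.

    Each frame is scored by its dot product with `query` (hypothetical:
    in a trained model this vector would be learned), the scores are
    normalized with softmax, and the frames are averaged using those weights.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]


if __name__ == "__main__":
    # Two toy 2-dimensional frames; a zero query weights them equally.
    print(attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

With a zero query every frame receives the same weight, so the result is the plain mean of the frames; a query aligned with one frame shifts nearly all of the weight onto it.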

RESULTS

Twenty-seven kinds of emotions were found in the participants' speech. These were grouped into 6 detailed emotions (happiness, interest, sadness, frustration, anger, and neutrality) and further into 3 basic emotions (positive, negative, and neutral). To improve algorithmic performance, multiple learning approaches were applied, using different data sources (voice and text) and varying the number of emotion classes. Ultimately, a 2-stage algorithm, in which an initial text-based classification is followed by voice-based analysis, achieved the highest accuracy, reaching 70%.
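
The label hierarchy and the 2-stage idea can be sketched as follows. This is an illustrative reading, not the authors' algorithm: the abstract does not state which detailed emotion belongs to which basic emotion, nor how the two stages are combined. The mapping below, the rule that stage 2 chooses only among detailed emotions consistent with the stage 1 label, and both toy models are assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Assumed assignment of the 6 detailed emotions to the 3 basic emotions.
DETAILED_TO_BASIC: Dict[str, str] = {
    "happiness": "positive",
    "interest": "positive",
    "sadness": "negative",
    "frustration": "negative",
    "anger": "negative",
    "neutrality": "neutral",
}


def candidates_for(basic: str) -> List[str]:
    """Detailed emotions that fall under a given basic emotion."""
    return [d for d, b in DETAILED_TO_BASIC.items() if b == basic]


def two_stage_predict(
    transcript: str,
    voice_scores: Dict[str, float],
    text_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Stage 1: basic emotion from text. Stage 2: detailed emotion from voice.

    `voice_scores` stands in for the per-emotion output of a voice model
    (hypothetical); only emotions consistent with stage 1 are considered.
    """
    basic = text_model(transcript)
    allowed = candidates_for(basic)
    detailed = max(allowed, key=lambda d: voice_scores.get(d, 0.0))
    return basic, detailed


def toy_text_model(transcript: str) -> str:
    """Keyword stand-in for a trained text classifier (illustration only)."""
    lowered = transcript.lower()
    if any(word in lowered for word in ("thank", "good", "fun")):
        return "positive"
    if any(word in lowered for word in ("hurt", "lonely", "hate")):
        return "negative"
    return "neutral"


if __name__ == "__main__":
    scores = {"sadness": 0.6, "anger": 0.3, "happiness": 0.9}
    print(two_stage_predict("I feel lonely today", scores, toy_text_model))
```

In this toy run the voice scores alone would favor happiness, but the text stage restricts the choice to negative emotions, which is one way a text-first cascade can change the final label.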

CONCLUSIONS

The diverse emotions identified in this study were attributed to the characteristics of the participants and the method of data collection. That the samples consisted of speech addressed to companion robots by people at high risk of dementia also explains the relatively low performance of the SER algorithm. Accordingly, this study suggests the systematic and comprehensive construction of a dataset from people with dementia.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7301/11300689/59c4dfafb667/dnd-23-146-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7301/11300689/ad4fca714057/dnd-23-146-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7301/11300689/d53cf8e076f1/dnd-23-146-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7301/11300689/a87e2ce0dfc0/dnd-23-146-g004.jpg

Similar articles

1
Speech Emotion Recognition in People at High Risk of Dementia.
Dement Neurocogn Disord. 2024 Jul;23(3):146-160. doi: 10.12779/dnd.2024.23.3.146. Epub 2024 Jul 24.
2
Objectively Quantifying Pediatric Psychiatric Severity Using Artificial Intelligence, Voice Recognition Technology, and Universal Emotions: Pilot Study for Artificial Intelligence-Enabled Innovation to Address Youth Mental Health Crisis.
JMIR Res Protoc. 2023 Oct 23;12:e51912. doi: 10.2196/51912.
3
Emotional Speech Recognition Using Deep Neural Networks.
Sensors (Basel). 2022 Feb 12;22(4):1414. doi: 10.3390/s22041414.
4
Speech Emotion Recognition Using Attention Model.
Int J Environ Res Public Health. 2023 Mar 14;20(6):5140. doi: 10.3390/ijerph20065140.
5
Deep-Net: A Lightweight CNN-Based Speech Emotion Recognition System Using Deep Frequency Features.
Sensors (Basel). 2020 Sep 12;20(18):5212. doi: 10.3390/s20185212.
6
A Hybrid Time-Distributed Deep Neural Architecture for Speech Emotion Recognition.
Int J Neural Syst. 2022 Jun;32(6):2250024. doi: 10.1142/S0129065722500241. Epub 2022 May 12.
7
Detecting Clinically Relevant Emotional Distress and Functional Impairment in Children and Adolescents: Protocol for an Automated Speech Analysis Algorithm Development Study.
JMIR Res Protoc. 2023 Jun 23;12:e46970. doi: 10.2196/46970.
8
Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network.
Sensors (Basel). 2020 Oct 23;20(21):6008. doi: 10.3390/s20216008.
9
Cascaded Convolutional Neural Network Architecture for Speech Emotion Recognition in Noisy Conditions.
Sensors (Basel). 2021 Jun 27;21(13):4399. doi: 10.3390/s21134399.
10
EEG-based emotion charting for Parkinson's disease patients using Convolutional Recurrent Neural Networks and cross dataset learning.
Comput Biol Med. 2022 May;144:105327. doi: 10.1016/j.compbiomed.2022.105327. Epub 2022 Mar 11.

Cited by

1
Emotional speech markers of psychiatric disturbance in Huntington's disease.
Front Psychiatry. 2025 Aug 12;16:1633492. doi: 10.3389/fpsyt.2025.1633492. eCollection 2025.
