Speech Emotion Recognition in People at High Risk of Dementia.

Authors

Kim Dongseon, Yi Bongwon, Won Yugwon

Affiliations

Department of Silver Business, Sookmyung Women's University, Seoul, Korea.

Department of Communication Disorders, Korea Nazarene University, Cheonan, Korea.

Publication information

Dement Neurocogn Disord. 2024 Jul;23(3):146-160. doi: 10.12779/dnd.2024.23.3.146. Epub 2024 Jul 24.

Abstract

BACKGROUND AND PURPOSE

The emotions of people at various stages of dementia need to be effectively utilized for prevention, early intervention, and care planning. With technology available for understanding and addressing the emotional needs of people, this study aims to develop speech emotion recognition (SER) technology to classify emotions for people at high risk of dementia.

METHODS

Speech samples from people at high risk of dementia were categorized into distinct emotions through human auditory assessment, and the resulting annotations were used as labels to guide a deep-learning method. To build the automated speech emotion recognition system, the architecture combined a convolutional neural network, long short-term memory, attention layers, and Wav2Vec2, a novel feature extractor.

RESULTS

Twenty-seven kinds of emotions were found in the participants' speech. These were grouped into 6 detailed emotions (happiness, interest, sadness, frustration, anger, and neutrality) and further into 3 basic emotions (positive, negative, and neutral). To improve algorithmic performance, multiple learning approaches were applied, using different data sources (voice and text) and varying the number of emotion classes. Ultimately, a 2-stage algorithm, in which an initial text-based classification is followed by voice-based analysis, achieved the highest accuracy, reaching 70%.
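The grouping and the 2-stage idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which detailed emotion belongs to which basic emotion, nor how the two stages are combined, so the mapping `DETAILED_TO_BASIC`, the function `two_stage_classify`, and the toy models below are all assumptions made for illustration. Here stage 1 is assumed to pick a basic emotion from the transcript, and stage 2 is assumed to pick the best-scoring detailed emotion within that group from the voice.

```python
from typing import Callable, Dict, Sequence, Tuple

# Assumed grouping of the 6 detailed emotions into the 3 basic emotions.
# The abstract lists both sets but does not spell out this mapping.
DETAILED_TO_BASIC: Dict[str, str] = {
    "happiness": "positive",
    "interest": "positive",
    "sadness": "negative",
    "frustration": "negative",
    "anger": "negative",
    "neutrality": "neutral",
}


def two_stage_classify(
    transcript: str,
    audio: Sequence[float],
    text_model: Callable[[str], str],
    voice_model: Callable[[Sequence[float]], Dict[str, float]],
) -> Tuple[str, str]:
    """Hypothetical cascade: text decides the basic emotion, voice refines it."""
    # Stage 1 (assumed): the text model returns one of the 3 basic emotions.
    basic = text_model(transcript)
    # Stage 2 (assumed): the voice model scores every detailed emotion.
    scores = voice_model(audio)
    # Only detailed emotions that belong to the stage-1 group are eligible.
    candidates = [e for e, b in DETAILED_TO_BASIC.items() if b == basic]
    detailed = max(candidates, key=lambda e: scores.get(e, 0.0))
    return basic, detailed


# Toy stand-ins for trained models, used only to make the sketch runnable.
def toy_text_model(transcript: str) -> str:
    lowered = transcript.lower()
    if "thank" in lowered:
        return "positive"
    if "tired" in lowered:
        return "negative"
    return "neutral"


def toy_voice_model(audio: Sequence[float]) -> Dict[str, float]:
    # Fixed scores; a real model would compute them from the waveform.
    return {
        "happiness": 0.2, "interest": 0.5, "sadness": 0.6,
        "frustration": 0.1, "anger": 0.1, "neutrality": 0.3,
    }


result = two_stage_classify(
    "Thank you for the song", [0.0], toy_text_model, toy_voice_model
)
print(result)
```

In this toy run the voice scores alone would favour "sadness", but the text stage has already restricted the candidates to the positive group, which is one plausible reason a cascade could outperform a voice-only classifier.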

CONCLUSIONS

The diversity of emotions identified in this study is attributed to the characteristics of the participants and the method of data collection. The fact that the speech of people at high risk of dementia was directed at companion robots also helps explain the relatively low performance of the SER algorithm. Accordingly, this study recommends the systematic and comprehensive construction of a dataset from people with dementia.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7301/11300689/59c4dfafb667/dnd-23-146-g001.jpg
