Machine Learning Approach to Identifying Empathy Using the Vocals of Mental Health Helpline Counselors: Algorithm Development and Validation.

Author information

Sanjeewa Ruvini, Iyer Ravi, Apputhurai Pragalathan, Wickramasinghe Nilmini, Meyer Denny

Affiliations

School of Health Sciences, Swinburne University of Technology, Hawthorn, PO Box 218, John Street, Melbourne, 3122, Australia, 61 422587030.

School of Computing, Engineering & Mathematical Sciences, La Trobe University, Melbourne, Australia.

Publication information

JMIR Form Res. 2025 Apr 16;9:e67835. doi: 10.2196/67835.

Abstract

BACKGROUND

This study aimed to detect the vocal features embedded in empathic counselor speech using samples of calls to a mental health helpline service.

OBJECTIVE

This study aimed to produce an algorithm for the identification of empathy from these features, which could act as a training guide for counselors and conversational agents who need to transmit empathy in their vocals.

METHODS

Two annotators with a psychology background and English heritage provided empathy ratings for 57 calls involving female counselors, as well as for multiple short call segments within each of these calls. These ratings were found to be well correlated between the 2 raters in a sample of 6 calls common to both raters. Vocal feature extraction from call segments, combined with statistical variable selection methods such as L1-penalized LASSO (least absolute shrinkage and selection operator) and forward selection, identified 14 significant vocal features associated with empathic speech. Generalized additive mixed models (GAMMs), binary logistic regression with splines, and random forest models were used to obtain an algorithm that differentiated between high- and low-empathy call segments.
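As an illustration of how an L1 penalty performs variable selection, the following is a minimal sketch rather than the study's pipeline. It assumes a squared-error LASSO fitted by cyclic coordinate descent; the synthetic data, the penalty value of 0.2, and the names `soft_threshold` and `lasso_coordinate_descent` are all hypothetical. The abstract does not state which software or penalty settings were used.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within [-penalty, penalty] become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_sweeps=100):
    """Minimise (1/(2n)) * sum((y - Xb)^2) + penalty * sum(|b|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    mean_squares = [sum(c * c for c in col) / n for col in columns]
    coefs = [0.0] * p
    residual = list(y)  # equals y - Xb while all coefficients are zero
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Mean product of feature j with the partial residual
            # (the residual with feature j's own contribution added back).
            rho = sum(c * (r + c * coefs[j]) for c, r in zip(col, residual)) / n
            new_coef = soft_threshold(rho, penalty) / mean_squares[j]
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                coefs[j] = new_coef
    return coefs


# Synthetic data (an assumption, not the study's data): 6 features, only 0 and 3 matter.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(300)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0, 0.5) for row in X]
y_mean = sum(y) / len(y)
y = [v - y_mean for v in y]  # centre the outcome because no intercept is fitted

coefs = lasso_coordinate_descent(X, y, penalty=0.2)  # penalty chosen arbitrarily
selected = [j for j, b in enumerate(coefs) if b != 0.0]
print("selected feature indices:", selected)
```

Features whose association with the residual stays below the penalty are set exactly to zero, which is what makes the LASSO usable as a selection step before fitting a downstream classifier.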

RESULTS

The binary logistic regression model achieved higher predictive accuracy for empathy (area under the curve [AUC]=0.617, 95% CI 0.613-0.622) than the GAMM (AUC=0.605, 95% CI 0.601-0.609) and the random forest model (AUC=0.600, 95% CI 0.595-0.604). This difference was statistically significant, as evidenced by the nonoverlapping 95% CIs obtained for AUC. The DeLong test further validated these results, showing a significant difference between the binary logistic model and both the random forest (D=6.443, df=186283, P<.001) and the GAMM (Z=5.846, P<.001). These findings confirm that the binary logistic regression model outperforms the other 2 models in predictive accuracy for empathy classification.
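To make the AUC and its 95% CI concrete, here is a minimal sketch that computes AUC as the proportion of positive-negative pairs ranked correctly and attaches a percentile bootstrap interval. The bootstrap is a swapped-in technique chosen for brevity; it is not the DeLong test reported above, and the abstract does not say how the CIs were derived. The synthetic scores and the function names `auc_from_scores` and `bootstrap_auc_ci` are hypothetical.

```python
import random


def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs where the positive scores higher; ties count 0.5."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def bootstrap_auc_ci(labels, scores, n_resamples=300, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling segments with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if 0 < sum(sample_labels) < n:  # skip resamples that lack one of the classes
            estimates.append(auc_from_scores(sample_labels, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic, weakly separated scores (an assumption, not the study's data).
rng = random.Random(1)
labels = [i % 2 for i in range(120)]  # 1 = high empathy, 0 = low empathy
scores = [rng.gauss(0.4 * label, 1.0) for label in labels]

auc = auc_from_scores(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC={auc:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking, so values near 0.6, like those reported for all 3 models, indicate only modest separation between high- and low-empathy segments.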

CONCLUSIONS

This study suggests that identifying empathy from vocal features alone is challenging, and further research involving multimodal models (eg, models incorporating facial expression, words used, and vocal features) is encouraged for detecting empathy in the future. This study has several limitations, including a relatively small sample of calls and only 2 empathy raters. Future research should focus on accommodating multiple raters with varied backgrounds to explore these effects on perceptions of empathy. Additionally, considering counselor vocals from larger, more heterogeneous populations, including mixed-gender samples, will allow an exploration of the factors influencing the level of empathy projected in counselor voices more generally.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7fd5/12017608/d4bbb6bb816e/formative-v9-e67835-g001.jpg
