Predicting working alliance in psychotherapy: A multi-modal machine learning approach.

Author information

Aafjes-van Doorn Katie, Cicconet Marcelo, Cohn Jeffrey F, Aafjes Marc

Affiliations

Deliberate AI, New York, NY, United States.

Faculty of Arts and Sciences, New York University Shanghai, Shanghai, People's Republic of China.

Publication information

Psychother Res. 2025 Feb;35(2):256-270. doi: 10.1080/10503307.2024.2428702. Epub 2025 Jan 1.

Abstract

OBJECTIVE

Session-by-session tracking of the working alliance enables clinicians to detect alliance deterioration and intervene accordingly, which has been shown to improve treatment outcome and reduce dropout. Despite this, regular use of alliance self-report measures has not been widely implemented. We aimed to develop an automated alliance prediction method using behavioral features obtained from video-recorded therapy sessions.

METHOD

A naturalistic dataset of session recordings with patient ratings of working alliance was available for 252 in-person and teletherapy sessions from 47 patients treated by 10 clinicians. Text- and audio-based features were extracted from all 252 sessions. Additional video-based feature extraction was possible for a subsample of 80 sessions. We developed a modeling pipeline, applied once to audio and text and once to audio, text, and video, to train machine learning regression models that fuse multimodal features.
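The abstract does not say how the modalities are fused. As a minimal sketch, assuming simple early fusion (concatenating per-session feature vectors), the two feature sets described above could be assembled as follows. The record layout, the function names `fuse` and `build_matrix`, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical per-session record: modality name -> feature vector.
Session = Dict[str, List[float]]


def fuse(session: Session, modalities: Sequence[str]) -> List[float]:
    """Early fusion: concatenate the requested modality vectors in order."""
    fused: List[float] = []
    for name in modalities:
        fused.extend(session[name])
    return fused


def build_matrix(sessions: Sequence[Session],
                 modalities: Sequence[str]) -> List[List[float]]:
    """Keep only sessions that have every requested modality, then fuse each."""
    usable = [s for s in sessions if all(m in s for m in modalities)]
    return [fuse(s, modalities) for s in usable]


# Made-up toy data: three sessions, only one of which has video features.
sessions = [
    {"audio": [0.1, 0.2], "text": [1.0]},
    {"audio": [0.3, 0.4], "text": [0.5], "video": [7.0, 8.0]},
    {"audio": [0.5, 0.6], "text": [0.0]},
]

X_audio_text = build_matrix(sessions, ["audio", "text"])   # all sessions
X_all = build_matrix(sessions, ["audio", "text", "video"])  # video subsample
```

In this sketch the audio and text matrix keeps every session, while the three-modality matrix keeps only sessions with video, which mirrors the 252-session and 80-session split in the abstract. Either matrix could then be passed to a regression model.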

RESULTS

Best results were achieved with a Gradient Boosting architecture, when using audio, text, and video features extracted from the patient (ICC = 0.66, Pearson = 0.70, MAE = 0.33).
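For readers unfamiliar with the three reported metrics, the sketch below computes them for a pair of rating lists. The abstract does not state which ICC form was used; this example assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement), treating the patient rating and the model prediction as two raters. The numbers are made up for illustration and are unrelated to the study data.

```python
from math import sqrt
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between ratings and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / sqrt(var_t * var_p)


def icc_2_1(y_true, y_pred):
    """ICC(2,1): assumed form, with k = 2 'raters' (rating and prediction)."""
    n, k = len(y_true), 2
    rows = list(zip(y_true, y_pred))
    grand = mean(list(y_true) + list(y_pred))
    row_means = [mean(r) for r in rows]
    col_means = [mean(y_true), mean(y_pred)]
    # Two-way ANOVA sums of squares: sessions (rows), raters (columns), error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Made-up example: predictions are offset from the ratings by a constant 1.0.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]

print(mae(y_true, y_pred), pearson_r(y_true, y_pred), icc_2_1(y_true, y_pred))
```

With a constant offset, the Pearson correlation stays at 1 while the absolute-agreement ICC drops to about 0.77 and the MAE equals the offset. This shows why an agreement measure and an error measure can add information beyond correlation alone.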

CONCLUSION

Automated alliance prediction from video-recorded therapy sessions is feasible with high accuracy. A data-driven multimodal approach to feature extraction and selection enables powerful models, outperforming previous work.
