Md Nasir, Brian Robert Baucom, Panayiotis Georgiou, Shrikanth Narayanan
Department of Electrical Engineering, University of Southern California, Los Angeles, United States of America.
Department of Psychology, University of Utah, Salt Lake City, Utah, United States of America.
PLoS One. 2017 Sep 21;12(9):e0185123. doi: 10.1371/journal.pone.0185123. eCollection 2017.
Automated assessment and prediction of marital outcomes in couples therapy is a challenging task, but it promises to be a useful tool for clinical psychologists. Computational approaches that infer therapy outcomes from observable behavioral information in conversations between spouses offer an objective means of understanding relationship dynamics. In this work, we explore whether the acoustics of the spoken interactions of clinically distressed spouses provide information for assessing therapy outcomes. The outcome prediction task includes detecting whether the relationship improved (posed as binary classification) as well as discerning varying degrees of improvement or decline in relationship status (posed as a multiclass recognition task). We use each interlocutor's acoustic speech characteristics, such as vocal intonation and intensity, both independently and in relation to one another, as cues for predicting the therapy outcome. We also compare this prediction performance with that obtained using standardized behavioral codes, provided by human experts to characterize the relationship dynamics, as features for automated classification. Our experiments, using data from a longitudinal clinical study of couples in distressed relationships, show that predictions of relationship outcomes obtained directly from vocal acoustics are comparable or superior to those obtained using human-rated behavioral codes as prediction features. In addition, combining signal-derived features with manually coded behavioral features improved prediction performance in most cases, indicating that the information captured by humans and by machine algorithms is complementary. Furthermore, considering the vocal properties of the interlocutors in relation to one another, rather than in isolation, proved important for improving automatic prediction. This finding supports the notion that the behavioral outcome, like many other behavioral aspects, is closely tied to the dynamics and mutual influence of the interlocutors during their interaction and the behavioral patterns that result.
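To make the described pipeline concrete, the sketch below shows one way such a system could be assembled: per-speaker vocal intonation (F0) and intensity (energy) statistics, relational cues formed by relating the two spouses' features, and a classifier for the binary improved/not-improved outcome. This is a minimal illustration, not the authors' implementation; the library choices (librosa, scikit-learn), feature statistics, and variable names are assumptions.

```python
# Hedged sketch: acoustic feature extraction and outcome classification for one couple.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def acoustic_features(wav_path, sr=16000):
    """Summarize vocal intonation (F0) and intensity (RMS energy) for one speaker."""
    y, sr = librosa.load(wav_path, sr=sr)
    f0, voiced, _ = librosa.pyin(y,
                                 fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C6"),
                                 sr=sr)
    f0 = f0[voiced]                          # keep voiced frames only
    rms = librosa.feature.rms(y=y)[0]
    return np.array([np.nanmean(f0), np.nanstd(f0), rms.mean(), rms.std()])

def couple_features(spouse_a_wav, spouse_b_wav):
    """Each spouse's features plus a simple relational cue (feature differences)."""
    a = acoustic_features(spouse_a_wav)
    b = acoustic_features(spouse_b_wav)
    return np.concatenate([a, b, a - b])     # "in relation to one another"

# Hypothetical usage: X holds one feature vector per couple, y holds the
# clinically assessed outcome label (1 = relationship improved, 0 = not).
# X = np.vstack([couple_features(a, b) for a, b in session_wavs])
# scores = cross_val_score(SVC(kernel="linear"), X, y, cv=5)
```

Human-rated behavioral codes could be appended to the same feature vector to test the combined (signal-derived plus manually coded) condition described in the abstract, and the binary label could be replaced with ordinal outcome levels for the multiclass task.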