Miner Adam S, Haque Albert, Fries Jason A, Fleming Scott L, Wilfley Denise E, Wilson G Terence, Milstein Arnold, Jurafsky Dan, Arnow Bruce A, Agras W Stewart, Li Fei-Fei, Shah Nigam H
Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA.
Department of Health Research and Policy, Stanford University, Stanford, CA, USA.
NPJ Digit Med. 2020 Jun 3;3:82. doi: 10.1038/s41746-020-0285-8. eCollection 2020.
Accurate transcription of audio recordings in psychotherapy would improve therapy effectiveness, clinician training, and safety monitoring. Although automatic speech recognition software is commercially available, its accuracy in mental health settings has not been well described. It is unclear which metrics and thresholds are appropriate for different clinical use cases, which may range from population descriptions to individual safety monitoring. Here we show that automatic speech recognition is feasible in psychotherapy, but further improvements in accuracy are needed before widespread use. Our HIPAA-compliant automatic speech recognition system demonstrated a transcription word error rate of 25%. For depression-related utterances, sensitivity was 80% and positive predictive value was 83%. For clinician-identified harm-related sentences, the word error rate was 34%. These results suggest that automatic speech recognition may support understanding of language patterns and subgroup variation in existing treatments but may not be ready for individual-level safety surveillance.
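The three metrics reported above can be illustrated with a minimal Python sketch. It assumes the conventional definitions: word error rate as the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript, sensitivity as TP / (TP + FN), and positive predictive value as TP / (TP + FP). The abstract does not describe the authors' scoring pipeline, text normalization, or tokenization, so the function names, the whitespace tokenization, and the example sentences and counts below are all hypothetical and are not study data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of truly relevant utterances that the transcript still captures."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Share of flagged utterances that are truly relevant."""
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Invented example: one substituted word out of four reference words.
    print(word_error_rate("i feel hopeless today", "i feel hopeful today"))
    # Invented counts, chosen only to show the arithmetic.
    print(sensitivity(true_positives=4, false_negatives=1))
    print(positive_predictive_value(true_positives=5, false_positives=1))
```

The example also shows why an aggregate word error rate can understate clinical risk: a single substitution in a short, safety-relevant sentence can change its meaning, which is consistent with the abstract's caution about individual-level safety surveillance.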