Agurto Carla, Cecchi Guillermo A, King Sarah, Eyigoz Elif K, Parvaz Muhammad A, Alia-Klein Nelly, Goldstein Rita Z
Thomas J. Watson Research Center, IBM, Yorktown Heights, New York.
Department of Psychiatry and Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York.
Biol Psychiatry. 2025 Jul 1;98(1):65-75. doi: 10.1016/j.biopsych.2025.01.009. Epub 2025 Jan 20.
Valid scalable biomarkers for predicting longitudinal clinical outcomes in psychiatric research are crucial for optimizing intervention and prevention efforts. Here, we recorded spontaneous speech from initially abstinent individuals with cocaine use disorder (iCUDs) for use in predicting drug use outcomes.
At baseline, 88 iCUDs provided 5-minute speech samples describing the positive consequences of quitting drug use and the negative consequences of using drugs. Outcomes, including withdrawal, craving, abstinence days, and recent cocaine use, were assessed at 3-month intervals for up to 1 year (57 iCUDs were included in the analyses). Predictive modeling compared natural language processing (NLP) techniques, specifically sentence embeddings with established inventories as targets, against models that used standard demographic and baseline psychometric variables.
At short time intervals, maximal predictive power was obtained with non-NLP models that also incorporated baseline values of the same drug use measures used as outcomes, potentially reflecting the slow rate of change of these measures, which could be estimated by linear functions. However, for longer-term predictions, models using speech samples alone yielded statistically significant results, with Spearman r ≥ 0.46 and 80% accuracy for predicting abstinence. Therefore, speech samples may capture nonlinear dynamics over extended intervals more effectively than traditional measures. These results need to be replicated in larger and independent samples.
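For readers unfamiliar with the two evaluation metrics reported above, the following is a minimal sketch of how a Spearman rank correlation and a classification accuracy can be computed from predicted and observed outcomes. It is not the authors' analysis pipeline: the function names (`average_ranks`, `spearman_r`, `accuracy`) and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def accuracy(predicted, observed):
    """Fraction of cases where the predicted label equals the observed one."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Made-up illustrative values, NOT data from the study.
predicted_days = [10, 40, 25, 60, 5]   # hypothetical predicted abstinence days
observed_days = [12, 35, 30, 80, 20]   # hypothetical observed abstinence days
rho = spearman_r(predicted_days, observed_days)

predicted_abstinent = [True, False, True, True, False]
observed_abstinent = [True, False, False, True, False]
acc = accuracy(predicted_abstinent, observed_abstinent)
```

Because Spearman r depends only on rank order, it rewards a model that orders participants correctly even when the predicted magnitudes are off, which suits outcomes such as abstinence days that need not vary linearly with the predictors.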
Compared with the common outcome measures used in clinical trials, speech-based measures could be leveraged as better predictors of longitudinal drug use outcomes in initially abstinent iCUDs. This approach is potentially generalizable to other subgroups with cocaine addiction and to additional substance use disorders and related comorbidity.