Mazur Alexa, Costantino Harrison, Tom Prentice, Wilson Michael P, Thompson Ronald G
Kintsugi Mindful Wellness, Inc, San Francisco, California.
Department of Computer Science, University of California, Berkeley, California.
Ann Fam Med. 2025 Jan 27;23(1):60-65. doi: 10.1370/afm.240091.
The US Preventive Services Task Force recommends mental health screening for all patients in areas where treatment options are available. Still, it is estimated that only 4% of primary care patients are screened for depression. The goal of this study was to evaluate the efficacy of machine learning technology (Kintsugi Voice, v1, Kintsugi Mindful Wellness, Inc) in detecting and analyzing voice biomarkers consistent with moderate to severe depression, which could help primary care practices meet this critical public health need more consistently.
We performed a cross-sectional study from February 1, 2021 to July 31, 2022, analyzing English-language samples of at least 25 seconds of free-form speech captured from 14,898 unique adults in the United States and Canada. Participants were recruited via social media and provided informed consent. Their voice biomarker results were compared with self-reported Patient Health Questionnaire-9 (PHQ-9) scores at a cut-off score of 10 (moderate to severe depression).
From as few as 25 seconds of free-form speech, the machine learning technology was able to detect vocal characteristics consistent with a PHQ-9 score ≥10, with a sensitivity of 71.3% (95% CI, 69.0%-73.5%) and a specificity of 73.5% (95% CI, 71.5%-75.5%).
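As an illustration of how figures of this kind are typically derived, the following minimal sketch computes sensitivity and specificity against the PHQ-9 cut-off of 10, with 95% confidence intervals. It is not the study's analysis code. The function names (`wilson_ci`, `screen_metrics`) and the toy data are hypothetical, and the Wilson score interval is an assumed choice, since the abstract does not state how its intervals were computed.

```python
from math import sqrt

PHQ9_CUTOFF = 10  # cut-off reported in the abstract (moderate to severe depression)


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, not from the source)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def screen_metrics(phq9_scores, model_flags):
    """Compare binary model flags with the PHQ-9 reference standard."""
    tp = fn = tn = fp = 0
    for score, flagged in zip(phq9_scores, model_flags):
        reference_positive = score >= PHQ9_CUTOFF
        if reference_positive and flagged:
            tp += 1
        elif reference_positive:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_ci(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical toy data for illustration only; not taken from the study.
    toy_scores = [3, 12, 15, 7, 10, 2, 18, 9]
    toy_flags = [False, True, False, False, True, True, True, False]
    print(screen_metrics(toy_scores, toy_flags))
```

With a sample as large as the study's, the choice between the Wilson interval and the simpler normal approximation would change the reported bounds very little. The Wilson interval is used here only because it behaves better on small toy inputs.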
Machine learning has potential utility in helping clinicians screen patients for moderate to severe depression. Further research is needed to measure the effectiveness of machine learning vocal detection and analysis technology in clinical deployment.