Department of Psychological Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand.
School of Pharmacy, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand.
JMIR Mhealth Uhealth. 2021 Sep 17;9(9):e24352. doi: 10.2196/24352.
Mood disorders are commonly underrecognized and undertreated because diagnosis relies on self-reporting and clinical assessments that are often not timely. The speech characteristics of people with mood disorders differ from those of healthy individuals. With the wide use of smartphones and the emergence of machine learning approaches, smartphones can be used to monitor speech patterns to support the diagnosis and monitoring of mood disorders.
The aim of this review is to synthesize research on using speech patterns from smartphones to diagnose and monitor mood disorders.
Literature searches of major databases (Medline, PsycInfo, EMBASE, and CINAHL) initially identified 832 relevant articles using the search terms "mood disorders", "smartphone", "voice analysis", and their variants. Only 13 studies met the inclusion criteria: use of a smartphone to capture voice data, a focus on diagnosing or monitoring one or more mood disorders, clinical populations recruited prospectively, and publication in English. Articles were assessed by 2 reviewers, and the data extracted included data type, classifiers used, methods of capture, and study results. Studies were analyzed using a narrative synthesis approach.
Studies showed that voice data alone had reasonable accuracy in predicting mood states and mood fluctuations based on objectively monitored speech patterns. Although a fusion of different sensor modalities yielded the highest accuracy (97.4%), nearly 80% of included studies were pilot trials or feasibility studies without control groups and had small sample sizes ranging from 1 to 73 participants. Studies were also carried out over short or varying timeframes and showed significant methodological heterogeneity in the types of audio data captured, environmental contexts, classifiers, and measures to control for privacy and ambient noise.
Approaches that allow smartphone-based monitoring of speech patterns in mood disorders are rapidly growing. The current body of evidence supports the value of speech patterns to monitor, classify, and predict mood states in real time. However, many challenges remain around the robustness, cost-effectiveness, and acceptability of such an approach. Further work is required to build on current research, reduce the heterogeneity of methodologies, and clinically evaluate the benefits and risks of such approaches.