Stegmann Gabriela M, Hahn Shira, Liss Julie, Shefner Jeremy, Rutkove Seward B, Kawabata Kan, Bhandari Samarth, Shelton Kerisa, Duncan Cayla Jessica, Berisha Visar
Arizona State University, Phoenix, Arizona, USA.
Aural Analytics, Scottsdale, Arizona, USA.
Digit Biomark. 2020 Dec 2;4(3):109-122. doi: 10.1159/000511671. eCollection 2020 Sep-Dec.
Changes in speech have the potential to provide important information on the diagnosis and progression of various neurological diseases. Many researchers have relied on open-source speech features to develop algorithms for measuring speech changes in clinical populations because these features are convenient and easy to use. However, the repeatability of open-source features in the context of neurological diseases has not been studied.
We used a longitudinal sample of healthy controls, individuals with amyotrophic lateral sclerosis, and individuals with suspected frontotemporal dementia, and we evaluated the repeatability of acoustic and language features separately on these 3 data sets.
Repeatability was evaluated using the intraclass correlation coefficient (ICC) and the within-subjects coefficient of variation (WSCV). In 3 sets of tasks, the median ICCs were between 0.02 and 0.55, and the median WSCVs were between 29% and 79%.
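To make the two repeatability statistics concrete, the sketch below computes them for a single feature measured in repeated sessions. It is an illustration under stated assumptions, not the authors' procedure: it assumes a one-way random-effects ICC and a WSCV defined as the pooled within-subject standard deviation divided by the grand mean, and the function names `icc_oneway` and `wscv` are hypothetical. The abstract does not state which ICC form or estimation method was used.

```python
import numpy as np


def _within_subject_mean_square(scores):
    """Pooled within-subject variance for a subjects x sessions array."""
    n, k = scores.shape
    subject_means = scores.mean(axis=1, keepdims=True)
    return np.sum((scores - subject_means) ** 2) / (n * (k - 1))


def icc_oneway(scores):
    """One-way random-effects ICC (assumed form; the source may use another)."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    subject_means = scores.mean(axis=1)
    # Between-subject mean square from a one-way ANOVA.
    ms_between = k * np.sum((subject_means - scores.mean()) ** 2) / (n - 1)
    ms_within = _within_subject_mean_square(scores)
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def wscv(scores):
    """Within-subject SD divided by the grand mean (assumed definition)."""
    scores = np.asarray(scores, dtype=float)
    return np.sqrt(_within_subject_mean_square(scores)) / scores.mean()


# Made-up values: 3 speakers, 2 sessions each, for one acoustic feature.
example = [[10.0, 12.0], [20.0, 22.0], [30.0, 28.0]]
print(f"ICC = {icc_oneway(example):.3f}, WSCV = {wscv(example):.1%}")
```

An ICC near 1 means most of the variance lies between speakers rather than between sessions, while a small WSCV means session-to-session fluctuation is small relative to the feature's typical magnitude. Both assume each row holds the same number of sessions per speaker.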
Our results demonstrate that the repeatability of speech features extracted using open-source tool kits is low. Researchers should exercise caution when developing digital health models with open-source speech features. We provide a detailed summary of feature-by-feature repeatability results (ICC, WSCV, SE of measurement, limits of agreement for WSCV, and minimal detectable change) in the online supplementary material so that researchers may incorporate repeatability information into the models they develop.
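As a rough illustration of how the supplementary quantities relate to the ICC, the sketch below uses the conventional textbook formulas for the standard error of measurement (SEM = SD × √(1 − ICC)) and the 95% minimal detectable change (MDC = 1.96 × √2 × SEM). These definitions are an assumption here; the supplementary material may define them differently, and the function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from a between-subject SD and an ICC (conventional formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this sketch expects an ICC between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem):
    """Smallest change between two sessions that exceeds noise at 95%."""
    return 1.96 * math.sqrt(2.0) * sem


# Express the MDC in units of the feature's SD by setting sd = 1.0.
for icc in (0.02, 0.55):  # the median ICC range reported in the abstract
    sem = standard_error_of_measurement(1.0, icc)
    print(f"ICC = {icc:.2f}: MDC95 = {minimal_detectable_change_95(sem):.2f} SD")
```

Under these assumed formulas, an ICC of 0.55 already implies that a change must approach two standard deviations of the feature before it can be distinguished from measurement noise, which is one way to read the caution urged above.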