Ter Huurne Daphne, Possemis Nina, Banning Leonie, Gruters Angélique, König Alexandra, Linz Nicklas, Tröger Johannes, Langel Kai, Verhey Frans, de Vugt Marjolein, Ramakers Inez
Alzheimer Center Limburg, School for Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands.
Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands.
Digit Biomark. 2023 Aug 31;7(1):115-123. doi: 10.1159/000533188. eCollection 2023 Jan-Dec.
We studied the accuracy of automatic speech recognition (ASR) software by comparing ASR scores with manual scores on a verbal learning test (VLT) and a semantic verbal fluency (SVF) task, both administered in a semiautomated phone assessment in a memory clinic population. We also examined how well these tests differentiate participants with subjective cognitive decline (SCD) from those with mild cognitive impairment (MCI), and whether automatically calculated speech and linguistic features add value beyond the commonly used total scores.
We included 94 participants from the memory clinic of the Maastricht University Medical Center+ (SCD = 56 and MCI = 38). A test leader guided each participant through a semiautomated phone assessment. The VLT and SVF were audio recorded and processed via a mobile application. The recall count and the speech and linguistic features were extracted automatically. Machine learning classifiers were trained to differentiate SCD from MCI participants.
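How well a classifier separates two diagnostic groups is commonly summarized by the area under the ROC curve (AUC). The following is a minimal sketch of that metric only, not the study's pipeline: the function name `roc_auc`, the label coding (1 = MCI, 0 = SCD), and the example scores are all assumptions made for illustration. It uses the rank interpretation of the AUC, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (here assumed to be MCI), 0 otherwise.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four participants (illustrative values only).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation.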
The intraclass correlation for inter-rater reliability between the manual and the ASR total word count was 0.89 (95% CI 0.09-0.97) for the VLT immediate recall, 0.94 (95% CI 0.68-0.98) for the VLT delayed recall, and 0.93 (95% CI 0.56-0.97) for the SVF. The full model including the total word count and speech and linguistic features had an area under the curve of 0.81 and 0.77 for the VLT immediate and delayed recall, respectively, and 0.61 for the SVF.
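Agreement between two scoring methods on the same participants can be quantified with an intraclass correlation coefficient (ICC). The abstract does not state which ICC form was used, so the sketch below assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1). The function name `icc_2_1` and the example data are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one per participant, each holding k scores
             (for example [manual_count, asr_count]).
    """
    n = len(ratings)        # number of participants
    k = len(ratings[0])     # number of raters / scoring methods
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative data: the second method scores every participant one word higher.
offset_ratings = [[1, 2], [2, 3], [3, 4]]
offset_icc = icc_2_1(offset_ratings)
```

Because this form measures absolute agreement, a constant offset between the two methods lowers the coefficient even when the scores are perfectly correlated, as in the illustrative data.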
Agreement between the ASR and manual scores was high, although the confidence intervals were broad. The phone-based VLT was able to differentiate between SCD and MCI and may be useful for clinical trial screening.