Agbavor Felix, Liang Hualou
School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104, USA.
Brain Sci. 2024 Dec 22;14(12):1292. doi: 10.3390/brainsci14121292.
Cognitive impairment poses a significant global health challenge, underscoring the critical need for early detection and intervention. Traditional diagnostics such as neuroimaging and clinical evaluation are often subjective, costly, and inaccessible, especially in resource-poor settings. Previous speech-analysis research has relied primarily on English data, leaving multilingual settings largely unexplored.
In this study, we present our results from the INTERSPEECH 2024 TAUKADIAL Challenge, where we aimed to automatically detect mild cognitive impairment (MCI) and predict cognitive scores for English and Chinese speakers (169 in total). Our approach leverages Whisper, a speech foundation model, to extract language-agnostic speech embeddings. We then utilize ensemble models to incorporate task-specific information.
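The two-stage idea described above (fixed-size embeddings from a speech foundation model, followed by an ensemble) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state how frame-level Whisper outputs are pooled, which ensemble members are used, or how their outputs are combined. The names `mean_pool`, `ensemble_probability`, and `predict_mci` are hypothetical, mean pooling and soft voting are stand-in choices, and the embeddings are taken as already extracted rather than computed by Whisper.

```python
from statistics import fmean
from typing import Callable, Dict, List, Sequence

Vector = List[float]
# Hypothetical interface: a "model" is any callable that maps a pooled
# embedding to an estimated probability of MCI.
Model = Callable[[Vector], float]


def mean_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Average frame-level embeddings into one fixed-size vector per recording.

    Mean pooling is an assumed choice, not one stated in the abstract.
    """
    if not frames:
        raise ValueError("need at least one frame")
    return [fmean(column) for column in zip(*frames)]


def ensemble_probability(models: Sequence[Model], embedding: Vector) -> float:
    """Soft voting (assumed): average the probabilities of all ensemble members."""
    return fmean(model(embedding) for model in models)


def predict_mci(
    frames: Sequence[Sequence[float]],
    language: str,
    ensembles: Dict[str, Sequence[Model]],
    threshold: float = 0.5,
) -> bool:
    """Route a recording to the ensemble trained for its language."""
    embedding = mean_pool(frames)
    return ensemble_probability(ensembles[language], embedding) >= threshold


# Toy stand-ins for trained models; real members would be fitted classifiers.
toy_ensembles: Dict[str, Sequence[Model]] = {
    "en": [lambda v: 0.9, lambda v: 0.7],
    "zh": [lambda v: 0.2, lambda v: 0.4],
}
toy_frames = [[0.0, 2.0], [2.0, 4.0]]  # two frames, two dimensions
```

Keying the ensembles by language mirrors the language-specific setup the study compares against a single language-agnostic model; a language-agnostic variant would simply use one ensemble for every recording.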
Our model achieved an unweighted average recall (UAR) of 81.83% in the MCI classification task and a root mean squared error (RMSE) of 1.196 in the cognitive score prediction task, placing second and first, respectively, in the challenge rankings. A comparison between language-agnostic and language-specific models reveals the importance of capturing language-specific nuances for accurate prediction of cognitive impairment.
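For readers unfamiliar with the two reported metrics, the sketch below shows their standard definitions: UAR is the mean of the per-class recalls, and RMSE is the square root of the mean squared difference between true and predicted scores. The function names and the toy labels and scores are illustrative only and are not data from the study.

```python
import math
from typing import Sequence


def unweighted_average_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recalls; each class counts equally regardless of its size."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


def root_mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the mean squared difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data (not from the study): 1 = MCI, 0 = control.
toy_labels = [1, 1, 1, 0]
toy_predictions = [1, 1, 0, 0]
uar = unweighted_average_recall(toy_labels, toy_predictions)  # (2/3 + 1/1) / 2

# Toy cognitive scores and predictions (not from the study).
rmse = root_mean_squared_error([28, 25, 30], [27, 27, 30])  # sqrt(5/3)
```

Because UAR averages recall over classes rather than over samples, a classifier cannot score well simply by favoring the majority class, which is why it suits an MCI-versus-control task with unequal group sizes.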
This study demonstrates that language-specific ensemble modeling with Whisper embeddings enables scalable, non-invasive cognitive health assessment for Alzheimer's disease, achieving state-of-the-art results in multilingual settings.