Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand.
Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand.
Sensors (Basel). 2022 Feb 17;22(4):1583. doi: 10.3390/s22041583.
The Montreal Cognitive Assessment (MoCA), a widely accepted screening tool for identifying patients with mild cognitive impairment (MCI), includes a language fluency test of verbal functioning; its score is based on the number of unique correct words produced by the test taker. However, unique words may be counted differently across languages. This study focuses on Thai, a language that differs from English in how words are combined. We applied various automatic speech recognition (ASR) techniques to develop an assisted scoring system for the MoCA language fluency test with Thai language support. This was challenging because Thai is a low-resource language for which domain-specific data are not publicly available, especially speech data from patients with MCI. Furthermore, the great variety in the patients' pronunciation, intonation, tone, and accent, all of which might differ from those of healthy controls, adds complexity to the model. We propose a hybrid time-delay neural network hidden Markov model (TDNN-HMM) architecture for acoustic model training to build an ASR system that is robust to environmental noise and to the variation in voice quality caused by MCI. The LOTUS Thai speech corpus was incorporated into the training set to improve the model's generalization. A preprocessing algorithm was implemented to reduce background noise and improve overall data quality before the data were fed into the TDNN-HMM system for automatic word detection and language fluency score calculation. The results show that the TDNN-HMM model, trained with the lattice-free maximum mutual information (LF-MMI) objective function in combination with data augmentation, achieves a word error rate (WER) of 30.77%. To our knowledge, this is the first study to develop an ASR system with Thai language support to automate the scoring of the MoCA language fluency assessment.
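The two quantities the abstract refers to, the fluency score (number of unique correct words) and the word error rate, can be illustrated with a minimal Python sketch. This is not the study's implementation: the function names, the `valid_words` set of accepted responses, and the assumption that the ASR output is already segmented into word tokens are all hypothetical, and the abstract does not state the exact rules used to decide which Thai words count as correct or unique.

```python
from typing import AbstractSet, Iterable, Sequence


def fluency_score(recognized: Iterable[str], valid_words: AbstractSet[str]) -> int:
    """Count unique recognized words that belong to the accepted-word set.

    `valid_words` is a hypothetical stand-in for whatever list of correct
    responses a scorer would use; repeated words are counted once.
    """
    return len({word for word in recognized if word in valid_words})


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Standard WER: (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; start with the empty reference prefix.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)
```

With placeholder tokens, `fluency_score(["w1", "w2", "w1", "zz"], {"w1", "w2", "w3"})` counts two unique accepted words, and `word_error_rate(["a", "b", "c", "d"], ["a", "x", "c"])` gives 0.5 (one substitution and one deletion over four reference words). A WER of 30.77% therefore means that roughly three in ten reference words were recognized incorrectly, which is why the abstract describes the system as assisting rather than replacing the human scorer.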