Ilias Loukas, Askounis Dimitris
IEEE J Biomed Health Inform. 2022 Aug;26(8):4153-4164. doi: 10.1109/JBHI.2022.3172479. Epub 2022 Aug 11.
Alzheimer's disease (AD) is the main cause of dementia, which is accompanied by memory loss and may have severe consequences for people's everyday lives if not diagnosed in time. Very few studies have exploited transformer-based networks, and despite the high accuracy achieved, little has been done on model interpretability. In addition, although Mini-Mental State Exam (MMSE) scores are inextricably linked with the identification of dementia, prior studies treat dementia identification and MMSE score prediction as two separate tasks. To address these limitations, we employ several transformer-based models, with BERT achieving the highest accuracy at 87.50%. Concurrently, we propose an interpretable method for detecting AD patients based on siamese networks, reaching an accuracy of up to 83.75%. Next, we introduce two multi-task learning models, where the main task is the identification of dementia (binary classification) and the auxiliary task is the identification of dementia severity (multiclass classification). Our model reaches an accuracy of 86.25% in detecting AD patients in the multi-task learning setting. Finally, we present new methods for identifying the linguistic patterns used by AD and non-AD patients, including text statistics, vocabulary uniqueness, word usage, correlations obtained through a detailed linguistic analysis, and explainability techniques (LIME). Findings indicate significant differences in language between AD and non-AD patients.
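The multi-task setting described in the abstract, a main binary task plus an auxiliary multiclass task, can be illustrated with a combined loss. The sketch below is not the authors' implementation: the abstract does not say how the two objectives are combined, so the weighted sum, the `aux_weight` parameter, the function names, and the choice of four severity classes are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-likelihood of the target class index."""
    return -math.log(softmax(logits)[target])

def multitask_loss(dementia_logits, dementia_label,
                   severity_logits, severity_label,
                   aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss.

    aux_weight is a hypothetical hyperparameter; a weighted sum is one
    common way to combine objectives, assumed here for illustration.
    """
    main = cross_entropy(dementia_logits, dementia_label)  # AD vs. non-AD
    aux = cross_entropy(severity_logits, severity_label)   # severity class
    return main + aux_weight * aux

# One sample: two logits for the binary task, four (assumed) severity classes.
loss = multitask_loss([1.2, -0.3], 0, [0.1, 0.4, -0.2, 0.0], 1)
print(f"combined loss: {loss:.4f}")
```

In a design of this kind, both heads would typically sit on top of a shared text encoder, so gradients from the severity task also shape the representation used for dementia detection.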