Varzandian Ali, Razo Miguel Angel Sanchez, Sanders Michael Richard, Atmakuru Akhila, Di Fatta Giuseppe
Department of Computer Science, University of Reading, Reading, United Kingdom.
Front Neurosci. 2021 May 28;15:673120. doi: 10.3389/fnins.2021.673120. eCollection 2021.
Machine learning methods are often adopted to infer useful biomarkers for the early diagnosis of many neurodegenerative diseases and, in general, of neuroanatomical ageing. Some of these methods estimate the subject's age from morphological brain data; this estimate is then referred to as "brain age". The difference between the predicted brain age and the actual chronological age of a subject can be used as an indication of a pathological deviation from normal brain ageing. An important use of the brain age model as a biomarker is the prediction of Alzheimer's disease (AD) from structural Magnetic Resonance Imaging (MRI). Many different machine learning approaches have been applied to this specific predictive task, some of which have achieved high accuracy at the expense of the descriptiveness of the model. This work investigates an appropriate combination of data science techniques and linear models that provides both high accuracy and good descriptiveness. The proposed method is based on a data workflow that includes typical data science methods, such as outlier detection, feature selection, linear regression, and logistic regression. In particular, a novel inductive bias is introduced in the regression model, which is aimed at improving the accuracy and the specificity of the classification task. The method is compared to other machine learning approaches for AD classification based on morphological brain data with and without the use of the brain age, including Support Vector Machines and Deep Neural Networks. This study uses brain MRI scans of 1,901 subjects acquired from three repositories (ADNI, AIBL, and IXI). A predictive model based only on the proposed apparent brain age and the chronological age has an accuracy of 88% and 92%, respectively, for male and female subjects, in a repeated cross-validation analysis, thus achieving performance comparable or superior to that of state-of-the-art machine learning methods.
The advantage of the proposed method is that it maintains the morphological semantics of the input space throughout the regression and classification tasks. The accurate predictive model is also highly descriptive and can be used to generate potentially useful insights on the predictions.
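As a rough illustration of the brain-age idea described above, the following Python sketch fits a linear age model on healthy subjects, computes the gap between predicted and chronological age, and feeds that gap to a logistic classifier. It is a minimal sketch under strong assumptions, not the authors' method: the data are synthetic and invented purely for illustration, a single hypothetical morphological feature stands in for the full feature set, the gap is used as one predictor instead of the two ages separately, and outlier detection, feature selection, the proposed inductive bias, and cross-validation are all omitted. The function names (`fit_simple_ols`, `fit_logistic_1d`, `make_subjects`) are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy data are reproducible


def fit_simple_ols(x, y):
    """Least-squares fit of y ~ intercept + slope * x for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic_1d(z, labels, lr=0.05, epochs=2000):
    """Logistic regression on one predictor via batch gradient descent."""
    w = c = 0.0
    n = len(z)
    for _ in range(epochs):
        grad_w = grad_c = 0.0
        for zi, yi in zip(z, labels):
            err = sigmoid(w * zi + c) - yi
            grad_w += err * zi
            grad_c += err
        w -= lr * grad_w / n
        c -= lr * grad_c / n
    return w, c


def make_subjects(n, extra_atrophy):
    """Synthetic subjects (assumption): the feature shrinks with age,
    and shrinks further by `extra_atrophy` in the simulated AD group."""
    ages = [rng.uniform(60, 90) for _ in range(n)]
    feats = [100 - 0.5 * a - extra_atrophy + rng.gauss(0, 1) for a in ages]
    return ages, feats


hc_age, hc_feat = make_subjects(200, 0.0)  # simulated healthy controls
ad_age, ad_feat = make_subjects(200, 5.0)  # simulated AD subjects

# Brain-age model: regress chronological age on the feature, healthy only.
intercept, slope = fit_simple_ols(hc_feat, hc_age)


def brain_age_gap(feature, age):
    """Predicted brain age minus chronological age."""
    return intercept + slope * feature - age


gap_hc = [brain_age_gap(f, a) for f, a in zip(hc_feat, hc_age)]
gap_ad = [brain_age_gap(f, a) for f, a in zip(ad_feat, ad_age)]
mean_gap_hc = sum(gap_hc) / len(gap_hc)
mean_gap_ad = sum(gap_ad) / len(gap_ad)

# Classifier: logistic regression on the gap (0 = healthy, 1 = AD).
gaps = gap_hc + gap_ad
labels = [0] * len(gap_hc) + [1] * len(gap_ad)
w, c = fit_logistic_1d(gaps, labels)
preds = [1 if sigmoid(w * g + c) >= 0.5 else 0 for g in gaps]
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)

print(f"mean gap (healthy): {mean_gap_hc:.2f} years")
print(f"mean gap (AD):      {mean_gap_ad:.2f} years")
print(f"training accuracy:  {accuracy:.2%}")
```

Because both models are linear, each coefficient stays tied to an interpretable quantity (a morphological feature or an age in years), which is the kind of descriptiveness the abstract emphasises. The accuracy printed here is a training accuracy on toy data and says nothing about the results reported in the study.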