Haridas Namitha Thalekkara, Sanchez-Bornot Jose M, McClean Paula L, Wong-Lin KongFatt
Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Magee campus, Derry~Londonderry, Northern Ireland, UK.
Personalised Medicine Centre, School of Medicine, Ulster University, Magee campus, Derry~Londonderry, Northern Ireland, UK.
Healthc Technol Lett. 2024 Sep 15;11(6):452-460. doi: 10.1049/htl2.12091. eCollection 2024 Dec.
Missing data are prevalent in Alzheimer's disease (AD) datasets and pose significant challenges for AD diagnosis. Previous studies have explored various imputation approaches on AD data, but systematic evaluation of deep learning algorithms for imputing heterogeneous and comprehensive AD data remains limited. This study investigates the efficacy of denoising autoencoder-based imputation of missing key features in heterogeneous data comprising tau-PET, MRI, cognitive and functional assessments, genotype, sociodemographic information, and medical history. The authors focused on extreme (≥40%) missing-at-random levels in key features that depend on AD progression, identified as a maternal history of AD, APoE ε4 alleles, and clinical dementia rating. Latent features extracted from the denoising autoencoder are combined with features chosen by traditional feature selection methods for subsequent classification. Using random forest classification with 10-fold cross-validation, the imputed datasets showed robust AD predictive performance across missingness levels (accuracy: 79%-85%; precision: 71%-85%), with high recall at 40% missingness. Further, the dataset reduced by feature selection methods, including the autoencoder, achieved a higher classification score than the original complete dataset. These results highlight the effectiveness and robustness of the autoencoder in imputing crucial information for reliable AD prediction in AI-based clinical decision support systems.
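The idea of denoising autoencoder-based imputation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the architecture (one tanh hidden layer), the corruption rate, the Adam settings, the function name `dae_impute`, and the synthetic data are all assumptions made for illustration. The sketch also treats every feature as numeric and removes values completely at random, whereas the study's key features are largely categorical and their missingness depends on AD progression.

```python
import numpy as np

def dae_impute(X, hidden=8, corrupt=0.2, lr=0.01, epochs=2000, seed=0):
    """Impute NaNs in X (n_samples, n_features) with a one-hidden-layer
    denoising autoencoder (hypothetical helper, illustrative only)."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)                        # True where a value was observed
    mu, sd = np.nanmean(X, axis=0), np.nanstd(X, axis=0)
    Z = np.where(mask, (X - mu) / sd, 0.0)     # standardise; missing -> 0 (the mean)
    d = X.shape[1]
    params = [rng.normal(0, d ** -0.5, (d, hidden)), np.zeros(hidden),
              rng.normal(0, hidden ** -0.5, (hidden, d)), np.zeros(d)]
    m = [np.zeros_like(p) for p in params]     # Adam first-moment estimates
    v = [np.zeros_like(p) for p in params]     # Adam second-moment estimates
    for t in range(1, epochs + 1):
        W1, b1, W2, b2 = params
        Zc = Z * (rng.random(Z.shape) > corrupt)   # denoising step: randomly zero inputs
        H = np.tanh(Zc @ W1 + b1)                  # encoder (latent features)
        out = H @ W2 + b2                          # decoder (reconstruction)
        # Gradient of the MSE over observed entries only (up to a factor of 2)
        err = (out - Z) * mask / mask.sum()
        dH = (err @ W2.T) * (1.0 - H ** 2)
        grads = [Zc.T @ dH, dH.sum(0), H.T @ err, err.sum(0)]
        for i, g in enumerate(grads):              # Adam update, in place
            m[i] = 0.9 * m[i] + 0.1 * g
            v[i] = 0.999 * v[i] + 0.001 * g ** 2
            params[i] -= lr * (m[i] / (1 - 0.9 ** t)) / (np.sqrt(v[i] / (1 - 0.999 ** t)) + 1e-8)
    W1, b1, W2, b2 = params
    X_hat = (np.tanh(Z @ W1 + b1) @ W2 + b2) * sd + mu   # back to original units
    return np.where(mask, X, X_hat)            # keep observed values, fill the rest

# Synthetic stand-in data: six correlated features driven by two latent factors.
rng = np.random.default_rng(42)
A = np.array([[1.0, 0.8, -0.6, 0.5, 0.9, -0.7],
              [0.5, -0.9, 0.7, 1.0, -0.4, 0.6]])
X_full = rng.normal(size=(300, 2)) @ A + 0.1 * rng.normal(size=(300, 6))

# Remove roughly 40% of the first feature (completely at random in this toy case).
X_miss = X_full.copy()
drop = rng.random(300) < 0.4
X_miss[drop, 0] = np.nan

X_imp = dae_impute(X_miss)
rmse_dae = np.sqrt(np.mean((X_imp[drop, 0] - X_full[drop, 0]) ** 2))
rmse_mean = np.sqrt(np.mean((np.nanmean(X_miss[:, 0]) - X_full[drop, 0]) ** 2))
print(f"DAE RMSE: {rmse_dae:.3f}  mean-imputation RMSE: {rmse_mean:.3f}")
```

The loss is computed only on observed entries, so the network is never trained to reproduce placeholder zeros, while the random input corruption forces it to reconstruct a feature from the others. The hidden activations `H` play the role of the latent features that the abstract describes as inputs to the downstream classifier; a random forest with 10-fold cross-validation would then be fit on the imputed matrix.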