School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, United Kingdom.
University of Information Technology and Communications, Baghdad, Iraq.
PLoS One. 2023 May 1;18(5):e0283712. doi: 10.1371/journal.pone.0283712. eCollection 2023.
The rising incidence of Alzheimer's disease (AD) is driving significant growth in socioeconomic challenges. Reliable prediction of AD could help mitigate, or at least slow, its progression, which makes identifying the factors that affect AD and diagnosing it accurately vital. In this study, we use a Genome-Wide Association Studies (GWAS) dataset, which comprises significant genetic markers of complex diseases. The original dataset contains a large number of attributes (620,901), for which we propose a hybrid feature selection approach based on an association test, principal component analysis, and the Boruta algorithm to identify the most promising predictors of AD. The selected features are then forwarded to wide and deep neural network models to classify AD cases and healthy controls. The experimental outcomes indicate that our approach outperformed existing methods when evaluated on a standard dataset, producing an accuracy and F1-score of 99%. The outcomes of this study are impactful, particularly the identified features, which comprise AD-associated genes, and a reliable classification model that might be useful for other chronic diseases.
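As a rough illustration of the first stage of such a pipeline, the sketch below filters SNPs with a per-marker association test. The abstract does not say which association test was used, so the allelic 2x2 chi-square test here is an assumption, as are the function names `allelic_chi2` and `filter_snps`, the threshold `alpha`, and the synthetic genotype data. The later PCA, Boruta, and neural network stages are not shown.

```python
import math
import random


def allelic_chi2(genotypes, labels):
    """Allelic 2x2 chi-square test for a single SNP (assumed test, 1 d.o.f.).

    genotypes: minor-allele count per sample (0, 1, or 2)
    labels:    1 for an AD case, 0 for a healthy control
    Returns (chi-square statistic, p-value).
    """
    minor = [0, 0]  # minor-allele totals for [controls, cases]
    major = [0, 0]  # major-allele totals for [controls, cases]
    for g, y in zip(genotypes, labels):
        minor[y] += g
        major[y] += 2 - g

    total = sum(minor) + sum(major)
    row = [minor[0] + major[0], minor[1] + major[1]]  # alleles per group
    col = [sum(minor), sum(major)]                    # alleles per type
    if 0 in row or 0 in col:
        return 0.0, 1.0  # monomorphic SNP or empty group: no evidence

    chi2 = 0.0
    for y in (0, 1):
        for observed, col_total in ((minor[y], col[0]), (major[y], col[1])):
            expected = row[y] * col_total / total
            chi2 += (observed - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def filter_snps(matrix, labels, alpha=1e-3):
    """Return the column indices of SNPs whose p-value is below alpha."""
    keep = []
    for j in range(len(matrix[0])):
        column = [sample[j] for sample in matrix]
        _, p_value = allelic_chi2(column, labels)
        if p_value < alpha:
            keep.append(j)
    return keep


# Synthetic data (illustrative only): SNP 0 has a different minor-allele
# frequency in cases and controls; SNPs 1-4 are unrelated to the label.
rng = random.Random(0)
labels = [i % 2 for i in range(400)]


def draw(freq):
    """Sample a genotype as the sum of two independent allele draws."""
    return int(rng.random() < freq) + int(rng.random() < freq)


matrix = [[draw(0.6 if y else 0.2)] + [draw(0.3) for _ in range(4)]
          for y in labels]
selected = filter_snps(matrix, labels)
print("SNP columns passing the association filter:", selected)
```

In a real GWAS setting, the threshold would normally be corrected for multiple testing, and the surviving markers would then be passed to the dimensionality reduction and Boruta steps described above.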