Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:5788-5791. doi: 10.1109/EMBC46164.2021.9629983.
Alzheimer's disease (AD) is the most prevalent neurodegenerative disorder and the most common form of dementia in the elderly. Because genetic factors are an important clinical risk factor for AD, genomic approaches such as genome-wide association studies (GWAS) have been widely applied in AD research. However, a main shortcoming of the GWAS method is that hereditary deletions are evident in GWAS studies, which results in low classification or prediction ability when using GWAS analysis. This paper therefore proposes a novel deep learning genomics approach and applies it to discriminate AD patients from healthy control (HC) subjects. We selected genotype data from 988 subjects enrolled in the Alzheimer's Disease Neuroimaging Initiative (ADNI), comprising 622 AD patients and 366 HC subjects. The proposed deep learning genomics (DLG) approach consists of three steps: quality control, SNP genotype coding, and classification. The ResNet framework was used as the DLG model. In the comparative GWAS analysis, APOE ε4 status and the normalized theta values of the significant SNP loci were used as predictors for genetic classification with a support vector machine (SVM). All data were divided into one training and validation group and one test group, and 5-fold cross-validation was repeated 500 times. Finally, we compared the classification results of the DLG model with those of the traditional GWAS analysis. In the test group, the accuracy, sensitivity, and specificity of classification were 71.38%±0.63%, 63.13%±2.87%, and 85.59%±6.66% for the traditional GWAS analysis, and 92.65%±4.80%, 85.00%±16.25%, and 97.10%±4.38% for the DLG model. Hence, the DLG model can achieve higher accuracy and sensitivity when applied to AD. More importantly, we discovered several novel genetic biomarkers of AD, including rs6311 and rs6313 in HTR2A and rs690705 in RFC3. The roles of these novel loci in AD should be explored in future work.
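To make the SNP genotype coding step and the three reported metrics concrete, the following is a minimal Python sketch. The abstract does not specify the coding scheme, so the additive coding used here (counting copies of the minor allele) is an assumption based on a common convention and may differ from the authors' encoding. The function names `encode_genotype` and `classification_metrics` and the toy labels are hypothetical; only the definitions of accuracy, sensitivity, and specificity are standard.

```python
from typing import Sequence, Tuple


def encode_genotype(genotype: str, minor_allele: str) -> int:
    """Additive coding (assumed, not taken from the paper):
    count how many copies of the minor allele the genotype carries."""
    return sum(1 for allele in genotype if allele == minor_allele)


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for binary labels,
    where 1 = AD and 0 = HC. A metric whose denominator is zero is NaN."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # AD predicted as AD
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # HC predicted as HC
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # HC predicted as AD
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # AD predicted as HC

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    sensitivity = ratio(tp, tp + fn)  # true positive rate
    specificity = ratio(tn, tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical genotypes at one SNP, with "G" assumed to be the minor allele.
    print([encode_genotype(g, "G") for g in ("AA", "AG", "GG")])

    # Hypothetical labels for five subjects (1 = AD, 0 = HC).
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

In a repeated cross-validation setting such as the one described above, these three values would be computed on the test group for each repetition and then summarized as mean ± standard deviation.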