College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China.
School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China.
Genes (Basel). 2021 May 1;12(5):683. doi: 10.3390/genes12050683.
The hippocampus is composed of distinguishable subregions that are differently involved in functions associated with Alzheimer's disease (AD). Identifying the hippocampal subregions and genes that classify AD and healthy control (HC) groups with high accuracy is therefore meaningful. In this study, by jointly analyzing multimodal data, we propose a novel method for constructing fusion features and a random-forest-based classification method for identifying the important features. Specifically, we construct the fusion features using the gene sequence and subregion correlation to reduce the diversity within the same group. Moreover, samples and features are selected randomly to construct a random forest, and a genetic algorithm and clustering evolution are used to amplify the differences among the initial decision trees and to evolve the trees. The features in the resulting decision trees that reach peak classification performance are the important "subregion-gene pairs". The findings verify that our method performs well in both classification and generalization. In particular, we identified some significant subregions and genes, such as the hippocampus amygdala transition area (HATA), fimbria, and parasubiculum, and genes including and . These discoveries provide some new candidate genes for AD and demonstrate the contribution of hippocampal subregions and genes to AD.
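The core idea of selecting samples and features at random, then reading the important features off the best-performing trees, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, one-level decision stumps stand in for full decision trees, the fusion-feature construction and the genetic-algorithm and clustering-evolution steps are omitted, and every name below (`make_toy_data`, `fit_stump`, `build_forest`, `important_features`) is hypothetical.

```python
import random
from collections import Counter


def make_toy_data(n_samples=120, n_features=8, seed=0):
    """Synthetic stand-in for fused 'subregion-gene pair' features (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)  # 1 = AD, 0 = HC
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.5 * label  # only feature 0 carries signal in this toy set
        X.append(row)
        y.append(label)
    return X, y


def fit_stump(X, y, rows, features):
    """Return the (feature, threshold, polarity) with the best accuracy on `rows`."""
    best_acc, best_stump = -1.0, None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            # Hits when "value above threshold" is read as class 1.
            correct = sum((X[i][f] > t) == (y[i] == 1) for i in rows)
            for polarity, hits in ((1, correct), (0, len(rows) - correct)):
                acc = hits / len(rows)
                if acc > best_acc:
                    best_acc, best_stump = acc, (f, t, polarity)
    return best_stump


def predict(stump, row):
    f, t, polarity = stump
    above = row[f] > t
    return int(above) if polarity == 1 else int(not above)


def build_forest(X, y, n_trees=40, n_sub_features=3, seed=1):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        features = rng.sample(range(len(X[0])), n_sub_features)
        stump = fit_stump(X, y, rows, features)
        in_bag = set(rows)
        oob = [i for i in range(n) if i not in in_bag]  # out-of-bag rows
        hits = sum(predict(stump, X[i]) == y[i] for i in oob)
        oob_acc = hits / len(oob) if oob else 0.0
        forest.append((oob_acc, stump))
    return forest


def important_features(forest, top_k=5):
    """Count which features the top_k most accurate stumps split on."""
    top = sorted(forest, key=lambda item: item[0], reverse=True)[:top_k]
    return Counter(stump[0] for _, stump in top)


X, y = make_toy_data()
forest = build_forest(X, y)
ranking = important_features(forest)
print(ranking.most_common())
```

In this toy setting the informative feature should dominate the ranking, because only stumps that were offered it can score well on their out-of-bag samples. The method described in the abstract additionally evolves the trees before the features are read off.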