Integration of Machine Learning Methods to Dissect Genetically Imputed Transcriptomic Profiles in Alzheimer's Disease.

Author information

Maj Carlo, Azevedo Tiago, Giansanti Valentina, Borisov Oleg, Dimitri Giovanna Maria, Spasov Simeon, Lió Pietro, Merelli Ivan

Affiliations

Institute for Genomic Statistics and Bioinformatics, University Hospital Bonn, Bonn, Germany.

Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom.

Publication information

Front Genet. 2019 Sep 3;10:726. doi: 10.3389/fgene.2019.00726. eCollection 2019.

Abstract

The genetic component of many common traits is associated with gene expression, and several variants act as expression quantitative trait loci (eQTLs), regulating gene expression in a tissue-specific manner. In this work, we applied tissue-specific cis-eQTL gene expression prediction models to the genotypes of 808 samples, including controls, subjects with mild cognitive impairment, and patients with Alzheimer's disease. We then dissected the imputed transcriptomic profiles with different unsupervised and supervised machine learning approaches to identify potential biological associations. Our analysis suggests that unsupervised and supervised methods can provide complementary information, which can be integrated for a better characterization of the underlying biological system. In particular, a variational autoencoder representation of the transcriptomic profiles, followed by support vector machine classification, was used for tissue-specific gene prioritization. Interestingly, the resulting gene prioritizations can be integrated as a feature selection step to improve the accuracy of deep learning classifier networks. The identified gene-tissue information suggests a potential role for inflammatory and regulatory processes in gut-brain axis related tissues. In line with the expected low heritability that can be apportioned to eQTL variants, we achieved only relatively low prediction capability with deep learning classification models. However, our analysis revealed that classification power strongly depends on the network structure, with recurrent neural networks being the best-performing network class. Interestingly, cross-tissue analysis suggests a potentially greater role for models trained in brain tissues, also when considering dementia-related endophenotypes.
Overall, the present analysis suggests that the combination of supervised and unsupervised machine learning techniques can be used for the evaluation of high-dimensional omics data.
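
As a rough illustration of the first step described in the abstract, the sketch below imputes gene expression from genotypes. It assumes a linear cis-eQTL model in which the imputed expression of a gene is a weighted sum of allele dosages; the abstract does not specify the model form, and the gene names, variant IDs, weights, and function names here are hypothetical.

```python
from typing import Dict, List

# Hypothetical cis-eQTL weights for one tissue: gene -> {variant ID -> weight}.
# Real weights would come from a trained prediction model; these are made up.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "GENE_A": {"rs1": 0.5, "rs2": -0.25},
    "GENE_B": {"rs2": 0.1, "rs3": 0.4},
}


def impute_expression(
    dosages: Dict[str, float], weights: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Return gene -> imputed expression as a weighted sum of allele dosages."""
    imputed = {}
    for gene, snp_weights in weights.items():
        # Simplifying assumption: a variant absent from the genotype data
        # is treated as dosage 0 and so contributes nothing.
        imputed[gene] = sum(
            w * dosages.get(snp, 0.0) for snp, w in snp_weights.items()
        )
    return imputed


def impute_cohort(
    cohort: List[Dict[str, float]], weights: Dict[str, Dict[str, float]]
) -> List[List[float]]:
    """Build a samples x genes matrix, with genes in sorted name order."""
    genes = sorted(weights)
    rows = []
    for dosages in cohort:
        imputed = impute_expression(dosages, weights)
        rows.append([imputed[g] for g in genes])
    return rows


# Toy genotypes: allele dosage (0, 1, or 2) per variant for two samples.
cohort = [
    {"rs1": 2, "rs2": 0, "rs3": 1},
    {"rs1": 0, "rs2": 2, "rs3": 2},
]
matrix = impute_cohort(cohort, WEIGHTS)
```

A samples-by-genes matrix of this kind, computed per tissue, is the sort of input that downstream unsupervised and supervised models would then consume.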

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ca/6735530/c0480eef254d/fgene-10-00726-g001.jpg
