An interpretable deep learning framework identifies proteomic drivers of Alzheimer's disease.

Author information

Panizza Elena, Cerione Richard A

Affiliations

Department of Molecular Medicine, Cornell University, Ithaca, NY, United States.

Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, United States.

Publication information

Front Cell Dev Biol. 2024 Sep 17;12:1379984. doi: 10.3389/fcell.2024.1379984. eCollection 2024.

Abstract

Alzheimer's disease (AD) is the leading neurodegenerative pathology in aged individuals, but many questions remain about its pathogenesis, and a cure is still not available. Recent research efforts have generated measurements of multiple omics in individuals who were healthy or diagnosed with AD. Although machine learning approaches are well-suited to handle the complexity of omics data, the models typically lack interpretability. Additionally, while the genetic landscape of AD is somewhat more established, the proteomic landscape of the diseased brain is less well understood. Here, we establish a deep learning method, EnsembleOmicsAE, that takes advantage of an ensemble of autoencoders (AEs) to project proteomics data into a low-dimensional space containing a small number of latent features. We combine brain proteomic data from 559 individuals across three AD cohorts and demonstrate that the ensemble autoencoder models generate stable latent features which are well-suited for downstream biological interpretation. We present an algorithm to calculate feature importance scores based on the iterative scrambling of individual input features (i.e., proteins) and show that the algorithm identifies signaling modules (AE signaling modules) that are significantly enriched in protein-protein interactions. The molecular drivers of AD identified within the AE signaling modules derived with EnsembleOmicsAE, including integrin signaling and cell adhesion, were missed by linear methods. Finally, we characterize the relationship between the AE signaling modules and the age at death of the patients and identify a differential regulation of vimentin and MAPK signaling in younger compared with older AD patients.
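The scrambling-based importance score described in the abstract can be illustrated with a minimal sketch. It is not the authors' EnsembleOmicsAE code: the one-layer `tanh` encoders, the fixed toy weights, the function names (`make_toy_encoder`, `scramble_importance`), and the choice to average the absolute change in latent features over models and repeats are all assumptions made here for illustration. The idea shown is only the general one: shuffle one input feature (protein) across samples, re-encode, and measure how much the latent features move.

```python
import math
import random
from typing import Callable, List

Matrix = List[List[float]]
Encoder = Callable[[Matrix], Matrix]


def make_toy_encoder(weights: Matrix) -> Encoder:
    """Hypothetical stand-in for a trained encoder: latent = tanh(x @ weights)."""
    n_latent = len(weights[0])

    def encode(X: Matrix) -> Matrix:
        return [
            [math.tanh(sum(x[j] * weights[j][k] for j in range(len(x))))
             for k in range(n_latent)]
            for x in X
        ]

    return encode


def scramble_importance(X: Matrix, encoders: List[Encoder],
                        n_repeats: int = 10, seed: int = 0) -> List[float]:
    """Score each input feature by the mean absolute change in the latent
    features when that feature alone is shuffled across samples.
    Averaging over encoders and repeats is an assumption of this sketch."""
    rng = random.Random(seed)
    n_features = len(X[0])
    baselines = [enc(X) for enc in encoders]  # latents on unscrambled data
    n_terms = n_repeats * len(encoders) * len(X) * len(baselines[0][0])
    scores = []
    for j in range(n_features):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # scramble feature j only
            X_scr = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            for enc, base in zip(encoders, baselines):
                Z = enc(X_scr)
                total += sum(abs(z - b)
                             for z_row, b_row in zip(Z, base)
                             for z, b in zip(z_row, b_row))
        scores.append(total / n_terms)
    return scores


# Toy data: 20 samples, 3 "proteins"; feature 2 has zero weight in both encoders.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
ensemble = [
    make_toy_encoder([[1.0, -0.5], [0.1, 0.05], [0.0, 0.0]]),
    make_toy_encoder([[0.8, 0.6], [-0.1, 0.1], [0.0, 0.0]]),
]
scores = scramble_importance(X, ensemble)
print(scores)
```

In this toy setup the third feature never influences the latent space, so scrambling it leaves the latent features unchanged and its score is zero, while the heavily weighted first feature scores highest. With real trained autoencoders the latent features of different ensemble members are not automatically aligned, so a per-latent-feature score would need an additional matching step that this sketch does not attempt.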

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8599/11442384/6e073d0cc2e4/fcell-12-1379984-g001.jpg
