Department of Molecular Medicine, Cornell University
J Vis Exp. 2023 Dec 15(202). doi: 10.3791/65910.
Large omics datasets are becoming increasingly available for research into human health. This paper presents DeepOmicsAE, a workflow optimized for the analysis of multi-omics datasets, including proteomics, metabolomics, and clinical data. The workflow employs a type of neural network called an autoencoder to extract a concise set of features from the high-dimensional multi-omics input data. It also provides a method to optimize the key parameters needed to implement the autoencoder. To showcase the workflow, clinical data from a cohort of 142 individuals who were either healthy or diagnosed with Alzheimer's disease were analyzed, along with the proteome and metabolome of their postmortem brain samples. The features extracted from the latent layer of the autoencoder retain the biological information that separates healthy individuals from those with the disease. In addition, the individual extracted features represent distinct molecular signaling modules, each of which interacts uniquely with the individuals' clinical features, providing a means to integrate the proteomics, metabolomics, and clinical data.
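To make the idea of extracting features from the latent layer concrete, the following is a minimal sketch of a single-hidden-layer autoencoder written in plain NumPy. It is not the DeepOmicsAE implementation: the synthetic data, the function names (`make_synthetic_omics`, `train_autoencoder`, `encode`), the tanh activation, and every hyperparameter (latent dimension, learning rate, number of epochs) are assumptions chosen only for illustration. Only the sample count of 142 mirrors the cohort size mentioned above.

```python
import numpy as np


def make_synthetic_omics(n_samples=142, n_features=40, n_factors=3, seed=0):
    """Generate synthetic, standardized 'omics-like' data (illustration only)."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))
    loadings = rng.normal(size=(n_factors, n_features))
    noise = 0.5 * rng.normal(size=(n_samples, n_features))
    X = factors @ loadings + noise
    # Standardize each feature to zero mean and unit variance.
    return (X - X.mean(axis=0)) / X.std(axis=0)


def train_autoencoder(X, latent_dim=4, lr=0.05, epochs=2000, seed=0):
    """Train a tanh encoder and linear decoder with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Small random weights; biases start at zero.
    W_enc = rng.normal(0.0, 0.1, size=(n_features, latent_dim))
    b_enc = np.zeros(latent_dim)
    W_dec = rng.normal(0.0, 0.1, size=(latent_dim, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        # Forward pass: encode to the latent layer, then decode.
        Z = np.tanh(X @ W_enc + b_enc)
        X_hat = Z @ W_dec + b_dec
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared reconstruction error.
        grad_out = 2.0 * err / err.size
        grad_W_dec = Z.T @ grad_out
        grad_b_dec = grad_out.sum(axis=0)
        grad_pre = (grad_out @ W_dec.T) * (1.0 - Z ** 2)  # tanh derivative
        grad_W_enc = X.T @ grad_pre
        grad_b_enc = grad_pre.sum(axis=0)
        # Gradient descent update.
        W_enc -= lr * grad_W_enc
        b_enc -= lr * grad_b_enc
        W_dec -= lr * grad_W_dec
        b_dec -= lr * grad_b_dec
    params = {"W_enc": W_enc, "b_enc": b_enc, "W_dec": W_dec, "b_dec": b_dec}
    return params, losses


def encode(X, params):
    """Return the latent-layer activations, i.e. the extracted features."""
    return np.tanh(X @ params["W_enc"] + params["b_enc"])


X = make_synthetic_omics()
params, losses = train_autoencoder(X)
latent = encode(X, params)
print("input shape:", X.shape)
print("latent feature shape:", latent.shape)
print("reconstruction loss (first, last epoch):", losses[0], losses[-1])
```

Each column of `latent` plays the role of one extracted feature. In a real analysis these columns would then be related back to the input molecules and clinical variables, and the latent dimension would be tuned rather than fixed by hand as it is here.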