
Variational autoencoders learn transferrable representations of metabolomics data.

Affiliations

Institute of Computational Biology, Helmholtz Center Munich-German Research Center for Environmental Health, 85764, Neuherberg, Germany.

Technical University of Munich-School of Life Sciences, 85354, Freising, Germany.

Publication information

Commun Biol. 2022 Jun 30;5(1):645. doi: 10.1038/s42003-022-03579-3.

DOI: 10.1038/s42003-022-03579-3
PMID: 35773471
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9246987/

Abstract

Dimensionality reduction approaches are commonly used for the deconvolution of high-dimensional metabolomics datasets into underlying core metabolic processes. However, current state-of-the-art methods are widely incapable of detecting nonlinearities in metabolomics data. Variational Autoencoders (VAEs) are a deep learning method designed to learn nonlinear latent representations which generalize to unseen data. Here, we trained a VAE on a large-scale metabolomics population cohort of human blood samples consisting of over 4500 individuals. We analyzed the pathway composition of the latent space using a global feature importance score, which demonstrated that latent dimensions represent distinct cellular processes. To demonstrate model generalizability, we generated latent representations of unseen metabolomics datasets on type 2 diabetes, acute myeloid leukemia, and schizophrenia and found significant correlations with clinical patient groups. Notably, the VAE representations showed stronger effects than latent dimensions derived by linear and non-linear principal component analysis. Taken together, we demonstrate that the VAE is a powerful method that learns biologically meaningful, nonlinear, and transferrable latent representations of metabolomics data.
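
As a rough illustration of the kind of objective a VAE optimizes, the sketch below computes the two standard loss terms for a single sample: a reconstruction error and a KL divergence that pulls the latent distribution toward a standard normal. This is not the authors' implementation. The abstract does not state the architecture, loss weighting, or hyperparameters, so the function names, the squared-error reconstruction term, and the `beta` weight are all assumptions made for illustration.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def vae_loss(x, x_reconstructed, mu, log_var, beta=1.0):
    """Squared-error reconstruction term plus a beta-weighted KL term.

    `beta` is a hypothetical weighting knob, not a value taken from the paper.
    """
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_reconstructed))
    return reconstruction + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.3, -0.1], [-0.5, 0.2]  # made-up encoder outputs for one sample
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

In a full model, `mu` and `log_var` would come from an encoder network applied to a metabolite profile and `x_reconstructed` from a decoder applied to `z`. For unseen datasets, one common choice is to use the encoder mean `mu` as the latent representation.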

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a761/9246987/9b0ff80603e8/42003_2022_3579_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a761/9246987/5a2cf0b54143/42003_2022_3579_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a761/9246987/1ab58581df80/42003_2022_3579_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a761/9246987/6a1c733ea4b4/42003_2022_3579_Fig4_HTML.jpg
