Novel multi-omics deconfounding variational autoencoders can obtain meaningful disease subtyping.

Affiliations

BIO3 - Laboratory for Systems Medicine, Department of Human Genetics, KU Leuven, Herestraat 49, 3000 Leuven, Belgium.

Medical Imaging Research Center, University Hospitals Leuven, Herestraat 49, 3000 Leuven, Belgium.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae512.


PMID: 39413796
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11483139/
Abstract

Unsupervised learning, particularly clustering, plays a pivotal role in disease subtyping and patient stratification, especially with the abundance of large-scale multi-omics data. Deep learning models, such as variational autoencoders (VAEs), can enhance clustering algorithms by leveraging inter-individual heterogeneity. However, the impact of confounders (external factors unrelated to the condition, e.g. batch effect or age) on clustering is often overlooked, introducing bias and spurious biological conclusions. In this work, we introduce four novel VAE-based deconfounding frameworks tailored for clustering multi-omics data. These frameworks effectively mitigate confounding effects while preserving genuine biological patterns. The deconfounding strategies employed include (i) removal of latent features correlated with confounders, (ii) a conditional VAE, (iii) adversarial training, and (iv) adding a regularization term to the loss function. Using real-life multi-omics data from The Cancer Genome Atlas, we simulated various confounding effects (linear, nonlinear, categorical, mixed) and assessed model performance across 50 repetitions based on reconstruction error, clustering stability, and deconfounding efficacy. Our results demonstrate that our novel models, particularly the conditional multi-omics VAE (cXVAE), successfully handle simulated confounding effects and recover biologically driven clustering structures. cXVAE accurately identifies patient labels and unveils meaningful pathological associations among cancer types, validating deconfounded representations. Furthermore, our study suggests that some of the proposed strategies, such as adversarial training, prove insufficient in confounder removal. In summary, our study contributes by proposing innovative frameworks for simultaneous multi-omics data integration, dimensionality reduction, and deconfounding in clustering. Benchmarking on open-access data offers guidance to end-users, facilitating meaningful patient stratification for optimized precision medicine.
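
To make strategy (i) concrete, the following is a minimal sketch of removing latent features that correlate with a confounder. It is an illustration only and not the authors' implementation: the abstract does not say which correlation measure or cutoff is used, so the Pearson correlation, the `threshold` value of 0.3, the function names `pearson` and `drop_confounded_dims`, and the toy data are all assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_confounded_dims(latent, confounder, threshold=0.3):
    """Drop latent dimensions whose |Pearson r| with the confounder reaches the threshold.

    latent: list of samples, each a list of latent feature values.
    confounder: one numeric confounder value per sample.
    threshold: hypothetical cutoff on |r|; not taken from the paper.
    Returns (filtered_latent, kept_indices).
    """
    n_dims = len(latent[0])
    kept = []
    for j in range(n_dims):
        column = [row[j] for row in latent]
        if abs(pearson(column, confounder)) < threshold:
            kept.append(j)
    filtered = [[row[j] for j in kept] for row in latent]
    return filtered, kept


# Toy data (invented): dimension 0 tracks the confounder (age), dimension 1 does not.
age = [30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
latent = [
    [0.31, 1.0],
    [0.39, -1.0],
    [0.52, -1.0],
    [0.58, 1.0],
    [0.71, 1.0],
    [0.80, -1.0],
]
filtered, kept = drop_confounded_dims(latent, age)
print(kept)
```

In this toy case only the second latent dimension survives, and clustering would then run on `filtered`. A linear correlation filter of this kind would likely miss the nonlinear confounding effects that the abstract also simulates.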

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9928/11483139/448be3aad8a4/bbae512f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9928/11483139/ab7c320d8036/bbae512f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9928/11483139/ea9e20344e3c/bbae512f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9928/11483139/5e17e451d9aa/bbae512f3.jpg
