VTrans: A VAE-Based Pre-Trained Transformer Method for Microbiome Data Analysis.

Authors

Shi Xinyuan, Zhu Fangfang, Min Wenwen

Affiliations

School of Information Science and Engineering, Yunnan University, Kunming, China.

School of Health and Nursing, Yunnan Open University, Kunming, China.

Publication information

J Comput Biol. 2025 Sep;32(9):850-864. doi: 10.1089/cmb.2024.0884. Epub 2025 Apr 28.

DOI:10.1089/cmb.2024.0884

PMID:40295093

Abstract

Predicting survival outcomes and assessing patient risk play a pivotal role in understanding microbial composition across the stages of cancer. Deep learning has been shown to have the potential to analyze patient survival risk from microbial data. However, individual cancer datasets commonly have a limited sample size and a high-dimensional feature space. This often causes deep learning models to overfit, hindering their ability to extract deep data representations and resulting in suboptimal performance. To overcome these challenges, we advocate pretraining and fine-tuning strategies, which have proven effective in addressing the small sample sizes of individual cancer datasets. In this study, we propose VTrans, a deep learning model that combines a Transformer encoder with a variational autoencoder (VAE) and employs both pretraining and fine-tuning to predict the survival risk of cancer patients from microbial data. Furthermore, we highlight the potential of extending VTrans to integrate microbial multi-omics data. Our method is assessed on three distinct cancer datasets from The Cancer Genome Atlas Program, and the findings demonstrate that (1) VTrans outperforms conventional machine learning and other deep learning models; (2) pretraining significantly enhances its performance; (3) compared with positional encoding, VAE encoding is more effective in enriching the data representation; and (4) using the idea of saliency maps, it is possible to observe which microbes contribute most to the classification results. These results demonstrate the effectiveness of VTrans in predicting patient survival risk.
Source code and all datasets used in this paper are available at https://github.com/wenwenmin/VTrans and https://doi.org/10.5281/zenodo.14166580.
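
Point (4) of the abstract refers to saliency maps. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a toy logistic-regression risk model in place of VTrans, and the feature names, weights, and function names are hypothetical. The saliency of each input feature is taken as the absolute gradient of the predicted probability with respect to that feature.

```python
import math

def predict_risk(weights, bias, x):
    """Toy logistic model (an assumption): probability of the high-risk class."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def saliency(weights, bias, x):
    """Absolute gradient of the predicted probability w.r.t. each input.

    For a logistic model, dp/dx_i = p * (1 - p) * w_i.
    """
    p = predict_risk(weights, bias, x)
    return [abs(p * (1.0 - p) * w) for w in weights]

def rank_features(names, scores):
    """Feature names sorted from most to least salient."""
    return [n for n, _ in sorted(zip(names, scores), key=lambda t: -t[1])]

# Hypothetical microbial abundance features and weights, for illustration only.
names = ["taxon_a", "taxon_b", "taxon_c"]
weights = [0.2, -1.5, 0.7]
bias = 0.0
x = [0.0, 0.0, 0.0]

scores = saliency(weights, bias, x)
ranking = rank_features(names, scores)
```

In a deep model such as the one the abstract describes, the gradient would typically come from automatic differentiation rather than a closed-form expression; the ranking step would be the same.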
