Liu Zhentao, Chiu Yu-Chiao, Chen Yidong, Huang Yufei
Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15260, USA.
Cancer Virology Program, UPMC Hillman Cancer Center, Pittsburgh, PA 15232, USA.
Cancers (Basel). 2024 Apr 25;16(9):1653. doi: 10.3390/cancers16091653.
Despite significant advances in tumor biology and clinical therapeutics, metastasis remains the primary cause of cancer-related deaths. While RNA-seq technology has been used extensively to study metastatic cancer characteristics, challenges persist in acquiring adequate transcriptomic data. To overcome this challenge, we propose MetGen, a generative contrastive learning tool based on a deep learning model. MetGen generates synthetic metastatic cancer expression profiles using primary cancer and normal tissue expression data. Our results demonstrate that MetGen generates samples comparable to actual metastatic cancer samples, with cancer and tissue classification yielding performance rates of 99.8 ± 0.2% and 95.0 ± 2.3%, respectively. A benchmark analysis suggests that the proposed model outperforms traditional generative models such as the variational autoencoder. In metastatic subtype classification, our generated samples show 97.6% predictive power compared to true metastatic samples. Additionally, we demonstrate MetGen's interpretability using metastatic prostate cancer and metastatic breast cancer. MetGen has learned highly relevant signatures in cancer, tissue, and tumor microenvironments, such as immune responses and the metastasis process, which can potentially foster a more comprehensive understanding of metastatic cancer biology. The development of MetGen represents a significant step forward in the study of metastatic cancer biology, providing a generative model that identifies candidate therapeutic targets for the treatment of metastatic cancer.