Deep contrastive learning for predicting cancer prognosis using gene expression values.

Affiliations

Department of Electrical and Computer Engineering, University of Miami, Miami, FL 33146, United States.

Department of Otolaryngology, University of Miami, Miami, FL 33146, United States.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae544.

DOI:10.1093/bib/bbae544

PMID:39471411

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11521346/

Abstract

Recent advancements in image classification have demonstrated that contrastive learning (CL) can aid in further learning tasks by acquiring good feature representation from a limited number of data samples. In this paper, we applied CL to tumor transcriptomes and clinical data to learn feature representations in a low-dimensional space. We then utilized these learned features to train a classifier to categorize tumors into a high- or low-risk group of recurrence. Using data from The Cancer Genome Atlas (TCGA), we demonstrated that CL can significantly improve classification accuracy. Specifically, our CL-based classifiers achieved an area under the receiver operating characteristic curve (AUC) greater than 0.8 for 14 types of cancer, and an AUC greater than 0.9 for 3 types of cancer. We also developed CL-based Cox (CLCox) models for predicting cancer prognosis. Our CLCox models trained with the TCGA data outperformed existing methods significantly in predicting the prognosis of 19 types of cancer under consideration. The performance of CLCox models and CL-based classifiers trained with TCGA lung and prostate cancer data was validated using the data from two independent cohorts. We also show that the CLCox model trained with the whole transcriptome significantly outperforms the Cox model trained with the 16 genes of Oncotype DX, which is in clinical use for breast cancer patients. The trained models and the Python code are publicly accessible and provide a valuable resource that will potentially find clinical applications for many types of cancer.
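The abstract names two building blocks, a contrastive objective for learning low-dimensional features and a Cox model for prognosis, without giving their exact form. The following is a minimal NumPy sketch of what such objectives commonly look like, not the authors' implementation: it assumes a generic supervised contrastive loss over risk-group labels and a Cox partial likelihood without tied event times. The function names, the temperature value, and the toy inputs are all hypothetical.

```python
import numpy as np

def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Generic supervised contrastive loss for one batch (assumed form).

    z: (n, d) embeddings; labels: (n,) integer classes,
    e.g. a hypothetical 0 = low risk, 1 = high risk.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit-length embeddings
    sim = z @ z.T / temperature                          # scaled cosine similarity
    sim = sim - sim.max(axis=1, keepdims=True)           # shift for numerical stability
    not_self = ~np.eye(len(labels), dtype=bool)
    # log-softmax of each pair over all other samples (self excluded)
    log_prob = sim - np.log((np.exp(sim) * not_self).sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0                                  # skip anchors with no positive
    per_anchor = -(log_prob * positives).sum(axis=1)[has_pos] / n_pos[has_pos]
    return per_anchor.mean()

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, assuming no tied event times.

    risk: (n,) predicted log-risk scores; time: (n,) follow-up times;
    event: (n,) 1 if the event was observed, 0 if censored.
    """
    order = np.argsort(-time)                            # longest follow-up first
    risk, event = risk[order], event[order]
    # running log-sum-exp gives log sum of exp(risk) over {j : time_j >= time_i}
    log_risk_set = np.logaddexp.accumulate(risk)
    return -np.sum((risk - log_risk_set) * event) / max(event.sum(), 1)

# Toy check: embeddings that cluster by label should score a lower loss.
z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matched = supervised_contrastive_loss(z, np.array([0, 0, 1, 1]))
loss_mismatched = supervised_contrastive_loss(z, np.array([0, 1, 0, 1]))

# Toy check: higher risk for shorter survival should score a lower loss.
time = np.array([5.0, 3.0, 1.0])
event = np.array([1, 1, 1])
loss_concordant = cox_neg_log_partial_likelihood(np.array([-1.0, 0.0, 1.0]), time, event)
loss_discordant = cox_neg_log_partial_likelihood(np.array([1.0, 0.0, -1.0]), time, event)
```

In a pipeline of the kind the abstract describes, an encoder would first be trained with the contrastive objective, and its output features would then feed either a classifier or a Cox layer that produces the `risk` scores. How the two stages are actually coupled in CLCox is not stated in the abstract.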

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea7b/11521346/1c015920d156/bbae544f1.jpg

Similar articles

Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations.

BMC Med Genomics. 2020 Apr 3;13(Suppl 5):41. doi: 10.1186/s12920-020-0686-1.

A machine learning classifier trained on cancer transcriptomes detects NF1 inactivation signal in glioblastoma.

BMC Genomics. 2017 Feb 6;18(1):127. doi: 10.1186/s12864-017-3519-7.

Identification of biomarkers associated with diagnosis and prognosis of colorectal cancer patients based on integrated bioinformatics analysis.

Gene. 2019 Apr 15;692:119-125. doi: 10.1016/j.gene.2019.01.001. Epub 2019 Jan 14.

Enhancing the prediction of IDC breast cancer staging from gene expression profiles using hybrid feature selection methods and deep learning architecture.

Med Biol Eng Comput. 2023 Nov;61(11):2895-2919. doi: 10.1007/s11517-023-02892-1. Epub 2023 Aug 2.

Identification of Breast Cancer Subtypes Based on Endoplasmic Reticulum Stress-Related Genes and Analysis of Prognosis and Immune Microenvironment in Breast Cancer Patients.

Technol Cancer Res Treat. 2024 Jan-Dec;23:15330338241241484. doi: 10.1177/15330338241241484.

Development and Validation of an Individualized Immune Prognostic Signature for Recurrent Prostate Cancer.

Comb Chem High Throughput Screen. 2021;24(1):98-108. doi: 10.2174/1386207323666200627212820.

The Combined Detection of Immune Genes for Predicting the Prognosis of Patients With Non-Small Cell Lung Cancer.

Technol Cancer Res Treat. 2020 Jan-Dec;19:1533033820977504. doi: 10.1177/1533033820977504.

Construction and Validation of a Prognostic Gene-Based Model for Overall Survival Prediction in Hepatocellular Carcinoma Using an Integrated Statistical and Bioinformatic Approach.

Int J Mol Sci. 2021 Feb 5;22(4):1632. doi: 10.3390/ijms22041632.

Convolutional neural network models for cancer type prediction based on gene expression.

BMC Med Genomics. 2020 Apr 3;13(Suppl 5):44. doi: 10.1186/s12920-020-0677-2.

Cited by

Prognostic value and immune infiltration of novel markers TNRC6C/AMPD1 in pancreatic cancer microenvironment.

Biochem Biophys Rep. 2025 Aug 4;43:102185. doi: 10.1016/j.bbrep.2025.102185. eCollection 2025 Sep.

High tumor CD161 expression predicts a survival advantage and marks a Th1-skewed microenvironment.

Front Immunol. 2025 Mar 17;16:1522755. doi: 10.3389/fimmu.2025.1522755. eCollection 2025.

PCLSurv: a prototypical contrastive learning-based multi-omics data integration model for cancer survival prediction.

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf124.

