Convolutional neural network for biomarker discovery for triple negative breast cancer with RNA sequencing data.

Authors

Chen Xiangning, Balko Justin M, Ling Fei, Jin Yabin, Gonzalez Anneliese, Zhao Zhongming, Chen Jingchun

Affiliations

410 AI, LLC, 10 Plummer Ct, Germantown, MD, 20876, USA.

Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, 2101 W End Ave, Nashville, TN, 37240, USA.

Publication information

Heliyon. 2023 Mar 23;9(4):e14819. doi: 10.1016/j.heliyon.2023.e14819. eCollection 2023 Apr.

DOI:10.1016/j.heliyon.2023.e14819

PMID:37025902

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10070674/

Abstract

Triple negative breast cancers (TNBCs) are tumors with a poor treatment response and prognosis. In this study, we propose a new approach, candidate extraction from convolutional neural network (CNN) elements (CECE), for discovery of biomarkers for TNBCs. We used the GSE96058 and GSE81538 datasets to build a CNN model to classify TNBCs and non-TNBCs and used the model to make TNBC predictions for two additional datasets, The Cancer Genome Atlas (TCGA) breast cancer RNA sequencing data and the data from Fudan University Shanghai Cancer Center (FUSCC). Using correctly predicted TNBCs from the GSE96058 and TCGA datasets, we calculated saliency maps for these subjects and extracted the genes that the CNN model used to separate TNBCs from non-TNBCs. Among the TNBC signature patterns that the CNN models learned from the training data, we found a set of 21 genes that can classify TNBCs into two major classes, or CECE subtypes, with distinct overall survival rates (P = 0.0074). We replicated this subtype classification in the FUSCC dataset using the same 21 genes, and the two subtypes had similar differential overall survival rates (P = 0.0490). When all TNBCs were combined from the three datasets, the CECE II subtype had a hazard ratio of 1.94 (95% CI, 1.25-3.01; P = 0.0032). The results demonstrate that the spatial patterns learned by the CNN models can be utilized to discover interacting biomarkers otherwise unlikely to be identified by traditional approaches.
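As a rough consistency check on the combined-dataset result, the reported hazard ratio and its 95% confidence interval can be converted back into an approximate test statistic. The sketch below assumes the interval is a symmetric Wald interval on the log hazard ratio scale; the abstract does not state which test produced the reported P value, so this is an illustration and not the authors' procedure. The function name `wald_p_from_ci` is hypothetical.

```python
import math

def wald_p_from_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided p value from a hazard ratio
    and its 95% CI, assuming a Wald interval on the log(HR) scale
    (an assumption; the source does not name the test used)."""
    # Half the CI width on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided standard normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Values reported in the abstract for the CECE II subtype.
z, p = wald_p_from_ci(1.94, 1.25, 3.01)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Under this assumption the back-calculated p value comes out close to 0.003, in line with the reported P = 0.0032.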

