Convolutional neural network models for cancer type prediction based on gene expression.

Affiliations

Greehey Children's Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX, 78229, USA.

Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, 78249, USA.

Publication information

BMC Med Genomics. 2020 Apr 3;13(Suppl 5):44. doi: 10.1186/s12920-020-0677-2.

PMID: 32241303
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7119277/
Abstract

BACKGROUND: Precise prediction of cancer types is vital for cancer diagnosis and therapy. Through a predictive model, important cancer marker genes can be inferred. Several studies have attempted to build machine learning models for this task; however, none has taken into consideration the effects of tissue of origin, which can potentially bias the identification of cancer markers.

RESULTS: In this paper, we introduced several Convolutional Neural Network (CNN) models that take unstructured gene expression inputs to classify tumor and non-tumor samples into their designated cancer types or as normal. Based on different designs of gene embeddings and convolution schemes, we implemented three CNN models: 1D-CNN, 2D-Vanilla-CNN, and 2D-Hybrid-CNN. The models were trained and tested on gene expression profiles from a combined set of 10,340 samples of 33 cancer types and 713 matched normal tissues from The Cancer Genome Atlas (TCGA). Our models achieved excellent prediction accuracies (93.9-95.0%) among 34 classes (33 cancers and normal). Furthermore, we interpreted one of the models, the 1D-CNN, with a guided saliency technique and identified a total of 2090 cancer markers (108 per class on average). The concordance of differential expression of these markers between the cancer type they represent and the others was confirmed. In breast cancer, for instance, our model identified well-known markers such as GATA3 and ESR1. Finally, we extended the 1D-CNN model to the prediction of breast cancer subtypes and achieved an average accuracy of 88.42% among 5 subtypes. The code is available at https://github.com/chenlabgccri/CancerTypePrediction.

CONCLUSIONS: Here we present novel CNN designs for accurate and simultaneous cancer/normal and cancer type prediction based on gene expression profiles, and a unique model interpretation scheme to elucidate the biological relevance of cancer marker genes after eliminating the effects of tissue of origin. The proposed model has few hyperparameters to train and can thus be easily adapted to facilitate cancer diagnosis in the future.
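To make the 1D-CNN and saliency ideas in the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation (their code is in the repository linked above). It assumes a toy architecture (one convolution layer with ReLU, max pooling, one dense layer, softmax over 34 classes), random untrained weights, and a synthetic 60-gene input. The kernel width, stride, pool size, and all function names are hypothetical choices made for illustration. The abstract's guided saliency technique is replaced here by a simpler finite-difference gradient magnitude, which only approximates the idea of ranking genes by their influence on the predicted class.

```python
import math
import random


def conv1d(x, kernels, stride):
    """Valid 1D convolution followed by ReLU; one feature map per kernel."""
    k = len(kernels[0])
    maps = []
    for w in kernels:
        row = []
        for start in range(0, len(x) - k + 1, stride):
            s = sum(w[j] * x[start + j] for j in range(k))
            row.append(max(0.0, s))  # ReLU
        maps.append(row)
    return maps


def max_pool(maps, size):
    """Non-overlapping max pooling applied to each feature map."""
    return [[max(m[i:i + size]) for i in range(0, len(m) - size + 1, size)]
            for m in maps]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def init_params(n_genes, n_classes, n_kernels=4, k=5, stride=5, pool=2, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)
    conv_len = (n_genes - k) // stride + 1
    n_feat = n_kernels * (conv_len // pool)
    return {
        "kernels": [[rng.gauss(0, 0.5) for _ in range(k)] for _ in range(n_kernels)],
        "dense": [[rng.gauss(0, 0.5) for _ in range(n_feat)] for _ in range(n_classes)],
        "stride": stride,
        "pool": pool,
    }


def predict(x, p):
    """Forward pass: conv -> pool -> flatten -> dense -> softmax."""
    maps = max_pool(conv1d(x, p["kernels"], p["stride"]), p["pool"])
    feat = [v for m in maps for v in m]
    logits = [sum(w * f for w, f in zip(row, feat)) for row in p["dense"]]
    return softmax(logits)


def saliency(x, p, cls, eps=1e-4):
    """Finite-difference estimate of |d prob[cls] / d x_i| for every gene i."""
    base = predict(x, p)[cls]
    scores = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        scores.append(abs(predict(bumped, p)[cls] - base) / eps)
    return scores


# Synthetic stand-in for one sample's expression profile (not real TCGA data).
rng = random.Random(1)
n_genes, n_classes = 60, 34  # 34 classes as in the abstract: 33 cancers + normal
x = [rng.random() * 10 for _ in range(n_genes)]

params = init_params(n_genes, n_classes)
probs = predict(x, params)
pred = max(range(n_classes), key=probs.__getitem__)
sal = saliency(x, params, pred)
top_genes = sorted(range(n_genes), key=lambda i: -sal[i])[:5]
```

Here `top_genes` holds the indices of the five input positions whose perturbation changes the predicted class probability the most. With trained weights and real expression data, a ranking of this kind is the sort of signal a saliency-based marker search would start from.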
