Predicting Cancer Tissue-of-Origin by a Machine Learning Method Using DNA Somatic Mutation Data.

Authors

Liu Xiaojun, Li Lianxing, Peng Lihong, Wang Bo, Lang Jidong, Lu Qingqing, Zhang Xizhe, Sun Yi, Tian Geng, Zhang Huajun, Zhou Liqian

Affiliations

School of Computer Science, Hunan University of Technology, Zhuzhou, China.

Chifeng Municipal Hospital, Chifeng, China.

Publication information

Front Genet. 2020 Jul 14;11:674. doi: 10.3389/fgene.2020.00674. eCollection 2020.

DOI:10.3389/fgene.2020.00674
PMID:32760423
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7372518/
Abstract

Patients with carcinoma of unknown primary (CUP) account for 3-5% of all cancer cases. A large number of metastatic cancers require further diagnosis to determine their tissue of origin. However, diagnosing CUP and identifying its primary site are challenging. Previous studies have suggested that molecular profiling of tissue-specific genes could be useful in inferring the primary tissue of a tumor. The purpose of this study was to evaluate how well somatic mutations detected in a tumor identify the cancer tissue of origin. We downloaded somatic mutation datasets from the International Cancer Genome Consortium project. The random forest algorithm was used to extract features, and a classifier was built with logistic regression. Specifically, the somatic mutations of 300 genes were extracted; these genes are significantly enriched in functions such as cell-to-cell adhesion. The prediction accuracy of tissue-of-origin inference for 3,374 cancer samples across 13 cancer types reached 81% in 10-fold cross-validation. Our method could be useful in identifying the cancer tissue of origin, as well as in the diagnosis and treatment of cancers.
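
The workflow summarized in the abstract (random forest feature extraction, a logistic regression classifier, and 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn and NumPy, uses a small synthetic binary sample-by-gene mutation matrix in place of the ICGC data, and picks every size and hyperparameter (300 samples, 200 genes, 3 cancer types, 30 selected genes, 100 trees) arbitrarily for illustration. Placing the feature selection inside the cross-validation pipeline is also a choice made here, not something the abstract states.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for real mutation data (assumption, not ICGC data):
# rows are tumor samples, columns are genes, 1 = gene carries a somatic mutation.
n_samples, n_genes, n_types, n_selected = 300, 200, 3, 30
y = np.repeat(np.arange(n_types), n_samples // n_types)  # cancer-type labels

# Every gene has a 5% background mutation rate; each cancer type also has
# 10 hypothetical "marker" genes that are mutated in half of its samples.
prob = np.full((n_samples, n_genes), 0.05)
for t in range(n_types):
    prob[y == t, t * 10:(t + 1) * 10] = 0.5
X = (rng.random((n_samples, n_genes)) < prob).astype(int)

pipeline = Pipeline([
    # Rank genes by random forest importance and keep the top n_selected.
    # threshold=-np.inf makes max_features the only selection criterion.
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        max_features=n_selected,
        threshold=-np.inf,
    )),
    # Multiclass logistic regression on the selected genes.
    ("clf", LogisticRegression(max_iter=1000)),
])

# Stratified 10-fold cross-validation; selection is refit on each training fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"10-fold CV accuracy: {mean_accuracy:.3f}")
```

Because the selector is a pipeline step, the gene ranking is recomputed on each training fold, so the held-out fold never influences which genes are kept. Selecting genes once on the full dataset before cross-validating would tend to give an optimistic accuracy estimate.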

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2329/7372518/15986af0d49e/fgene-11-00674-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2329/7372518/ce2fcc27e8f2/fgene-11-00674-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2329/7372518/3d8a332158ef/fgene-11-00674-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2329/7372518/62ccb4070378/fgene-11-00674-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2329/7372518/e070204a356d/fgene-11-00674-g004.jpg
