Identification of Tumor Tissue of Origin with RNA-Seq Data and Using Gradient Boosting Strategy.

Affiliations

School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China.

Key Laboratory of Computational Science and Application of Hainan Province, Haikou 571158, China.

Publication information

Biomed Res Int. 2021 Feb 17;2021:6653793. doi: 10.1155/2021/6653793. eCollection 2021.

DOI:10.1155/2021/6653793
PMID:33681364
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7904362/
Abstract

BACKGROUND

Cancer of unknown primary (CUP) is a type of malignant tumor, which is histologically diagnosed as a metastatic carcinoma while the tissue-of-origin cannot be identified. CUP accounts for roughly 5% of all cancers. Traditional treatment for CUP is primarily broad-spectrum chemotherapy; however, the prognosis is relatively poor. Thus, it is of clinical importance to accurately infer the tissue-of-origin of CUP.

METHODS

We developed a gradient boosting framework to trace tissue-of-origin of 20 types of solid tumors. Specifically, we downloaded the expression profiles of 20,501 genes for 7713 samples from The Cancer Genome Atlas (TCGA), which were used as the training data set. The RNA-seq data of 79 tumor samples from 6 cancer types with known origins were also downloaded from the Gene Expression Omnibus (GEO) for an independent data set.
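
To make the idea of gradient boosting concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a toy two-class problem with two made-up expression features, squared-error loss, and one-split decision stumps as weak learners; all function names, data values, and hyperparameters are hypothetical. The study itself covers 20 tumor types, and the abstract does not state which boosting library or loss was used.

```python
# Toy gradient boosting with decision stumps (squared loss), pure Python.
# All names, data, and hyperparameters are hypothetical illustrations.

def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals."""
    best = None  # (error, feature, threshold, left_value, right_value)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return None if best is None else best[1:]

def stump_predict(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def fit_boosting(X, y, n_rounds=20, learning_rate=0.3):
    """Start from the mean label, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature has two distinct values to split on
            break
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict_label(model, row):
    base, lr, stumps = model
    score = base + lr * sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0.5 else 0

# Hypothetical expression values for two genes; label 1 = tissue A, 0 = tissue B.
X = [[5.1, 0.2], [4.8, 0.4], [5.5, 0.1], [1.0, 0.3], [0.7, 0.2], [1.2, 0.5]]
y = [1, 1, 1, 0, 0, 0]

model = fit_boosting(X, y)
print([predict_label(model, row) for row in X])
print(predict_label(model, [4.9, 0.3]))
```

Each round adds a small correction fitted to the errors of the current ensemble, which is the core of the boosting strategy; a multi-class setup such as the 20 tumor types here would typically use a multi-class loss or one model per class.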

RESULTS

400 genes were selected to train a gradient boosting model for identification of the primary site of the tumor. The overall 10-fold cross-validation accuracy of our method was 96.1% across 20 types of cancer, while the accuracy for the independent data set reached 83.5%.
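
As an illustration of how a 10-fold cross-validation accuracy can be computed, here is a minimal sketch in plain Python. It makes several assumptions that are not in the source: folds are stratified by class, "overall accuracy" is taken as correct held-out predictions pooled over all folds divided by the number of samples, and a nearest-centroid classifier stands in for the gradient boosting model. The data are synthetic, and the tissue labels and function names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, spreading each class round-robin."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    folds = [[] for _ in range(k)]
    slot = 0
    for lab in sorted(by_class):
        idx = by_class[lab]
        rng.shuffle(idx)
        for i in idx:
            folds[slot % k].append(i)
            slot += 1
    return folds

def fit_centroids(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums, counts = {}, defaultdict(int)
    for row, lab in zip(X, y):
        if lab not in sums:
            sums[lab] = [0.0] * len(row)
        sums[lab] = [s + v for s, v in zip(sums[lab], row)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict_centroid(centroids, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], row)))

def cross_val_accuracy(X, y, k=10, seed=0):
    """Pooled accuracy: correct held-out predictions over all folds / n samples."""
    correct = 0
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct += sum(predict_centroid(model, X[i]) == y[i] for i in test_idx)
    return correct / len(y)

# Synthetic, well-separated data: 3 hypothetical tissue labels, 20 samples each.
rng = random.Random(42)
centers = {"lung": (0.0, 0.0), "breast": (10.0, 0.0), "colon": (0.0, 10.0)}
X, y = [], []
for lab, (cx, cy) in centers.items():
    for _ in range(20):
        X.append([cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)])
        y.append(lab)

folds = stratified_folds(y, k=10)
accuracy = cross_val_accuracy(X, y, k=10)
print(f"10-fold cross-validation accuracy: {accuracy:.3f}")
```

Every sample is held out exactly once, so the pooled figure reflects performance on data the model did not see during fitting. Any feature selection, such as choosing a gene subset, would need to be repeated inside each training fold to avoid leaking information from the held-out samples.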

CONCLUSION

Our gradient boosting framework accurately identified tumor tissue-of-origin on both the training data and the independent test data, suggesting that it may be of practical use.



Similar articles

1
Identification of Tumor Tissue of Origin with RNA-Seq Data and Using Gradient Boosting Strategy.
Biomed Res Int. 2021 Feb 17;2021:6653793. doi: 10.1155/2021/6653793. eCollection 2021.
2
A Machine Learning Method to Trace Cancer Primary Lesion Using Microarray-Based Gene Expression Data.
Front Oncol. 2022 Apr 21;12:832567. doi: 10.3389/fonc.2022.832567. eCollection 2022.
3
CUP-AI-Dx: A tool for inferring cancer tissue of origin and molecular subtype using RNA gene-expression data and artificial intelligence.
EBioMedicine. 2020 Nov;61:103030. doi: 10.1016/j.ebiom.2020.103030. Epub 2020 Oct 9.
4
Evaluating DNA Methylation, Gene Expression, Somatic Mutation, and Their Combinations in Inferring Tumor Tissue-of-Origin.
Front Cell Dev Biol. 2021 May 3;9:619330. doi: 10.3389/fcell.2021.619330. eCollection 2021.
5
TOD-CUP: a gene expression rank-based majority vote algorithm for tissue origin diagnosis of cancers of unknown primary.
Brief Bioinform. 2021 Mar 22;22(2):2106-2118. doi: 10.1093/bib/bbaa031.
6
A cross-cohort computational framework to trace tumor tissue-of-origin based on RNA sequencing.
Sci Rep. 2023 Sep 16;13(1):15356. doi: 10.1038/s41598-023-42465-8.
7
TOOme: A Novel Computational Framework to Infer Cancer Tissue-of-Origin by Integrating Both Gene Mutation and Expression.
Front Bioeng Biotechnol. 2020 May 19;8:394. doi: 10.3389/fbioe.2020.00394. eCollection 2020.
8
RNA-Seq accurately identifies cancer biomarker signatures to distinguish tissue of origin.
Neoplasia. 2014 Nov 20;16(11):918-27. doi: 10.1016/j.neo.2014.09.007. eCollection 2014 Nov.
9
Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci.
Front Oncol. 2022 Aug 9;12:946552. doi: 10.3389/fonc.2022.946552. eCollection 2022.
10
Identification and validation of 12 immune-related genes as a prognostic signature for colon adenocarcinoma.
J Biochem Mol Toxicol. 2021 Sep;35(9):e22852. doi: 10.1002/jbt.22852. Epub 2021 Aug 15.

Cited by

1
AITeQ: a machine learning framework for Alzheimer's prediction using a distinctive five-gene signature.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae291.
2
New techniques to identify the tissue of origin for cancer of unknown primary in the era of precision medicine: progress and challenges.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae028.
3
Retracted: Identification of Tumor Tissue of Origin with RNA-Seq Data and Using Gradient Boosting Strategy.
Biomed Res Int. 2023 Nov 29;2023:9865973. doi: 10.1155/2023/9865973. eCollection 2023.
4
A cross-cohort computational framework to trace tumor tissue-of-origin based on RNA sequencing.
Sci Rep. 2023 Sep 16;13(1):15356. doi: 10.1038/s41598-023-42465-8.
5
Intra-tumor heterogeneity, turnover rate and karyotype space shape susceptibility to missegregation-induced extinction.
PLoS Comput Biol. 2023 Jan 23;19(1):e1010815. doi: 10.1371/journal.pcbi.1010815. eCollection 2023 Jan.
6
Pragmatic Expectancy on Microbiota and Non-Small Cell Lung Cancer: A Narrative Review.
Cancers (Basel). 2022 Jun 26;14(13):3131. doi: 10.3390/cancers14133131.
7
Artificial intelligence and machine learning approaches using gene expression and variant data for personalized medicine.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac191.
8
A Machine Learning Method to Trace Cancer Primary Lesion Using Microarray-Based Gene Expression Data.
Front Oncol. 2022 Apr 21;12:832567. doi: 10.3389/fonc.2022.832567. eCollection 2022.
9
90-Gene Expression Profiling for Tissue Origin Diagnosis of Cancer of Unknown Primary.
Front Oncol. 2021 Oct 7;11:722808. doi: 10.3389/fonc.2021.722808. eCollection 2021.

References

1
A Deep Learning-Based Chemical System for QSAR Prediction.
IEEE J Biomed Health Inform. 2020 Oct;24(10):3020-3028. doi: 10.1109/JBHI.2020.2977009. Epub 2020 Feb 28.
2
The Cdk2-c-Myc-miR-571 Axis Regulates DNA Replication and Genomic Stability by Targeting Geminin.
Cancer Res. 2019 Oct 1;79(19):4896-4910. doi: 10.1158/0008-5472.CAN-19-0020. Epub 2019 Aug 20.
3
Gene Expression Profiling for Diagnosis of Triple-Negative Breast Cancer: A Multicenter, Retrospective Cohort Study.
Front Oncol. 2019 May 7;9:354. doi: 10.3389/fonc.2019.00354. eCollection 2019.
4
A prognostic 11 long noncoding RNA expression signature for breast invasive carcinoma.
J Cell Biochem. 2019 Oct;120(10):16692-16702. doi: 10.1002/jcb.28927. Epub 2019 May 16.
5
Gastric metastasis of ovarian serous cystadenocarcinoma.
Int Med Case Rep J. 2018 Sep 5;11:201-204. doi: 10.2147/IMCRJ.S171985. eCollection 2018.
6
miR-137 mediates the functional link between c-Myc and EZH2 that regulates cisplatin resistance in ovarian cancer.
Oncogene. 2019 Jan;38(4):564-580. doi: 10.1038/s41388-018-0459-x. Epub 2018 Aug 30.
7
Bioinformatics analysis of RNA sequencing data reveals multiple key genes in uterine corpus endometrial carcinoma.
Oncol Lett. 2018 Jan;15(1):205-212. doi: 10.3892/ol.2017.7346. Epub 2017 Nov 3.
8
Reproductive factors and incidence of endometrial cancer in U.S. black women.
Cancer Causes Control. 2017 Jun;28(6):579-588. doi: 10.1007/s10552-017-0880-4. Epub 2017 Mar 30.
9
How to Diagnose and Treat a Cancer of Unknown Primary Site.
J Gastrointestin Liver Dis. 2017 Mar;26(1):69-79. doi: 10.15403/jgld.2014.1121.261.haz.
10
Dysregulation of the homeobox transcription factor gene HOXB13: role in prostate cancer.
Pharmgenomics Pers Med. 2014 Aug 5;7:193-201. doi: 10.2147/PGPM.S38117. eCollection 2014.