
Sorting Five Human Tumor Types Reveals Specific Biomarkers and Background Classification Genes.

Affiliations

Clemson University, Department of Genetics & Biochemistry, Clemson, 29634, SC, USA.

Quantum Insights Inc., Menlo Park, 94025, California, USA.

Publication information

Sci Rep. 2018 May 25;8(1):8180. doi: 10.1038/s41598-018-26310-x.

DOI:10.1038/s41598-018-26310-x
PMID:29802335
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5970138/
Abstract

We applied two state-of-the-art, knowledge independent data-mining methods - Dynamic Quantum Clustering (DQC) and t-Distributed Stochastic Neighbor Embedding (t-SNE) - to data from The Cancer Genome Atlas (TCGA). We showed that the RNA expression patterns for a mixture of 2,016 samples from five tumor types can sort the tumors into groups enriched for relevant annotations including tumor type, gender, tumor stage, and ethnicity. DQC feature selection analysis discovered 48 core biomarker transcripts that clustered tumors by tumor type. When these transcripts were removed, the geometry of tumor relationships changed, but it was still possible to classify the tumors using the RNA expression profiles of the remaining transcripts. We continued to remove the top biomarkers for several iterations and performed cluster analysis. Even though the most informative transcripts were removed from the cluster analysis, the sorting ability of remaining transcripts remained strong after each iteration. Further, in some iterations we detected a repeating pattern of biological function that wasn't detectable with the core biomarker transcripts present. This suggests the existence of a "background classification" potential in which the pattern of gene expression after continued removal of "biomarker" transcripts could still classify tumors in agreement with the tumor type.
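
The iterative procedure described in the abstract (rank the most informative transcripts, remove them, and check whether the remaining transcripts still sort the samples) can be illustrated with a small sketch. This is not the authors' pipeline: DQC is not used here, the data are synthetic, and every function name and parameter below is hypothetical. As stand-ins, features are ranked by a simple between-class to within-class variance ratio (which, unlike DQC, uses the class labels), and sorting ability is measured with a nearest-centroid classifier on a held-out half of the samples.

```python
import random
from statistics import mean, pvariance


def make_data(n_per_class=30, n_classes=3, n_strong=5, n_weak=60, seed=0):
    """Synthetic 'expression' matrix: a few strong markers plus many weak ones."""
    rng = random.Random(seed)
    n_feat = n_strong + n_weak
    # Class-specific mean per feature: large spread for strong markers, small for weak.
    shifts = [[rng.gauss(0, 4.0 if j < n_strong else 0.6) for j in range(n_feat)]
              for _ in range(n_classes)]
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            X.append([shifts[c][j] + rng.gauss(0, 1.0) for j in range(n_feat)])
            y.append(c)
    return X, y


def feature_scores(X, y, features):
    """Variance of the class means divided by the mean within-class variance."""
    classes = sorted(set(y))
    scores = {}
    for j in features:
        groups = [[row[j] for row, lab in zip(X, y) if lab == c] for c in classes]
        between = pvariance([mean(g) for g in groups])
        within = mean(pvariance(g) for g in groups)
        scores[j] = between / (within + 1e-12)
    return scores


def centroid_accuracy(X, y, features):
    """Fit class centroids on even-indexed samples, score on odd-indexed ones."""
    train = range(0, len(X), 2)
    test = range(1, len(X), 2)
    classes = sorted(set(y))
    centroids = {
        c: [mean(X[i][j] for i in train if y[i] == c) for j in features]
        for c in classes
    }
    hits = 0
    for i in test:
        point = [X[i][j] for j in features]
        pred = min(classes, key=lambda c: sum((p - q) ** 2
                                              for p, q in zip(point, centroids[c])))
        hits += pred == y[i]
    return hits / len(test)


def iterative_removal(X, y, n_iter=4, k=5):
    """Repeatedly record accuracy, then drop the k top-scoring remaining features."""
    remaining = list(range(len(X[0])))
    history = []
    for _ in range(n_iter):
        acc = centroid_accuracy(X, y, remaining)
        scores = feature_scores(X, y, remaining)
        top = sorted(remaining, key=scores.get, reverse=True)[:k]
        history.append((len(remaining), acc, top))
        remaining = [j for j in remaining if j not in top]
    return history


if __name__ == "__main__":
    X, y = make_data()
    for n_left, acc, removed in iterative_removal(X, y):
        print(f"{n_left} features -> accuracy {acc:.2f}; removing {removed}")
```

In this toy setting the many weak features jointly carry enough signal to keep separating the classes after the strongest ones are removed, which is the intuition behind the "background classification" idea. It says nothing about the actual TCGA results.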

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/f685aae65aea/41598_2018_26310_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/f8177ba9f74b/41598_2018_26310_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/265c83d70052/41598_2018_26310_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/a2251aa922aa/41598_2018_26310_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/300b9ef4c65c/41598_2018_26310_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/9d51ec9dc5f5/41598_2018_26310_Fig6_HTML.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/ab24a1128858/41598_2018_26310_Fig7_HTML.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e29/5970138/57bc6d3d811c/41598_2018_26310_Fig8_HTML.jpg

Similar articles

1
Sorting Five Human Tumor Types Reveals Specific Biomarkers and Background Classification Genes.
Sci Rep. 2018 May 25;8(1):8180. doi: 10.1038/s41598-018-26310-x.
2
Sequential analysis of transcript expression patterns improves survival prediction in multiple cancers.
BMC Cancer. 2020 Apr 7;20(1):297. doi: 10.1186/s12885-020-06756-x.
3
Expression patterns of small numbers of transcripts from functionally-related pathways predict survival in multiple cancers.
BMC Cancer. 2019 Jul 12;19(1):686. doi: 10.1186/s12885-019-5851-6.
4
Analyzing the similarity of samples and genes by MG-PCC algorithm, t-SNE-SS and t-SNE-SG maps.
BMC Bioinformatics. 2018 Dec 17;19(1):512. doi: 10.1186/s12859-018-2495-5.
5
A comprehensive genomic pan-cancer classification using The Cancer Genome Atlas gene expression data.
BMC Genomics. 2017 Jul 3;18(1):508. doi: 10.1186/s12864-017-3906-0.
6
Impact of Tumor Purity on Immune Gene Expression and Clustering Analyses across Multiple Cancer Types.
Cancer Immunol Res. 2018 Jan;6(1):87-97. doi: 10.1158/2326-6066.CIR-17-0201. Epub 2017 Nov 15.
7
Differential gene expression profiling of gastric intraepithelial neoplasia and early-stage adenocarcinoma.
World J Gastroenterol. 2014 Dec 21;20(47):17883-93. doi: 10.3748/wjg.v20.i47.17883.
8
GLAD: a mixed-membership model for heterogeneous tumor subtype classification.
Bioinformatics. 2015 Jan 15;31(2):225-32. doi: 10.1093/bioinformatics/btu618. Epub 2014 Sep 29.
9
Identification of Disease-miRNA Networks Across Different Cancer Types Using SWIM.
Methods Mol Biol. 2019;1970:169-181. doi: 10.1007/978-1-4939-9207-2_10.
10
Machine learning analysis of gene expression data reveals novel diagnostic and prognostic biomarkers and identifies therapeutic targets for soft tissue sarcomas.
PLoS Comput Biol. 2019 Feb 20;15(2):e1006826. doi: 10.1371/journal.pcbi.1006826. eCollection 2019 Feb.

Cited by

1
Transcriptional profiles reveal histologic origin and prognosis across 33 The Cancer Genome Atlas tumor types.
Transl Cancer Res. 2023 Oct 31;12(10):2764-2780. doi: 10.21037/tcr-23-234. Epub 2023 Sep 20.
2
Diagnosis of Acute Leukemia by Multiparameter Flow Cytometry with the Assistance of Artificial Intelligence.
Diagnostics (Basel). 2022 Mar 28;12(4):827. doi: 10.3390/diagnostics12040827.
3
Cellular State Transformations Using Deep Learning for Precision Medicine Applications.
Patterns (N Y). 2020 Aug 17;1(6):100087. doi: 10.1016/j.patter.2020.100087. eCollection 2020 Sep 11.
4
Uncovering biomarker genes with enriched classification potential from Hallmark gene sets.
Sci Rep. 2019 Jul 5;9(1):9747. doi: 10.1038/s41598-019-46059-1.
5
Data mining to understand health status preceding traumatic brain injury.
Sci Rep. 2019 Apr 3;9(1):5574. doi: 10.1038/s41598-019-41916-5.
6
Analyzing the similarity of samples and genes by MG-PCC algorithm, t-SNE-SS and t-SNE-SG maps.
BMC Bioinformatics. 2018 Dec 17;19(1):512. doi: 10.1186/s12859-018-2495-5.

References

1
Discovering Condition-Specific Gene Co-Expression Patterns Using Gaussian Mixture Models: A Cancer Case Study.
Sci Rep. 2017 Aug 17;7(1):8617. doi: 10.1038/s41598-017-09094-4.
2
A comprehensive genomic pan-cancer classification using The Cancer Genome Atlas gene expression data.
BMC Genomics. 2017 Jul 3;18(1):508. doi: 10.1186/s12864-017-3906-0.
3
An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.
BMC Syst Biol. 2017 Apr 12;11(1):47. doi: 10.1186/s12918-017-0420-6.
4
The dbGaP data browser: a new tool for browsing dbGaP controlled-access genomic data.
Nucleic Acids Res. 2017 Jan 4;45(D1):D819-D826. doi: 10.1093/nar/gkw1139. Epub 2016 Nov 29.
5
InterPro in 2017-beyond protein family and domain annotations.
Nucleic Acids Res. 2017 Jan 4;45(D1):D190-D199. doi: 10.1093/nar/gkw1107. Epub 2016 Nov 29.
6
Pan-cancer subtyping in a 2D-map shows substructures that are driven by specific combinations of molecular characteristics.
Sci Rep. 2016 Apr 25;6:24949. doi: 10.1038/srep24949.
7
2D Representation of Transcriptomes by t-SNE Exposes Relatedness between Human Tissues.
PLoS One. 2016 Feb 23;11(2):e0149853. doi: 10.1371/journal.pone.0149853. eCollection 2016.
8
Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma.
Cell. 2016 Jan 28;164(3):550-63. doi: 10.1016/j.cell.2015.12.028.
9
The Pfam protein families database: towards a more sustainable future.
Nucleic Acids Res. 2016 Jan 4;44(D1):D279-85. doi: 10.1093/nar/gkv1344. Epub 2015 Dec 15.
10
The Reactome pathway Knowledgebase.
Nucleic Acids Res. 2016 Jan 4;44(D1):D481-7. doi: 10.1093/nar/gkv1351. Epub 2015 Dec 9.