A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data.

Authors

Valle Filippo, Osella Matteo, Caselle Michele

Affiliation

Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy.

Publication

Cancers (Basel). 2020 Dec 16;12(12):3799. doi: 10.3390/cancers12123799.

DOI: 10.3390/cancers12123799
PMID: 33339347
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7766023/

Abstract

Topic modeling is a widely used technique to extract relevant information from large arrays of data. The problem of finding a topic structure in a dataset was recently recognized to be analogous to the community detection problem in network theory. Leveraging this analogy, a new class of topic modeling strategies has been introduced to overcome some of the limitations of classical methods. This paper applies these recent ideas to TCGA transcriptomic data on breast and lung cancer. The established cancer subtype organization is well reconstructed in the inferred latent topic structure. Moreover, we identify specific topics that are enriched in genes known to play a role in the corresponding disease and are strongly related to the survival probability of patients. Finally, we show that a simple neural network classifier operating in the low-dimensional topic space is able to predict with high accuracy the cancer subtype of a test expression sample.
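
The analogy between topic structure and network communities can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' pipeline: the sample names, gene names, counts, and the gene-to-topic assignment below are all made up, and the assignment is simply given rather than inferred by a community detection method. The sketch treats samples as documents and genes as words, builds the weighted sample-gene bipartite edge list, and summarizes each sample as a low-dimensional topic mixture.

```python
from collections import defaultdict

# Toy expression counts: sample -> {gene: count}. All names and values are hypothetical.
counts = {
    "sample_A": {"gene_1": 8, "gene_2": 6, "gene_3": 1},
    "sample_B": {"gene_1": 1, "gene_3": 7, "gene_4": 6},
}

# Hypothetical hard assignment of genes to topics. In a real analysis this
# grouping would be inferred from the network, not written by hand.
gene_topic = {
    "gene_1": "topic_1",
    "gene_2": "topic_1",
    "gene_3": "topic_2",
    "gene_4": "topic_2",
}


def bipartite_edges(counts):
    """Return (sample, gene, weight) edges of the sample-gene bipartite network."""
    return [
        (sample, gene, weight)
        for sample, genes in counts.items()
        for gene, weight in genes.items()
        if weight > 0
    ]


def topic_mixture(counts, gene_topic):
    """For each sample, the fraction of its total counts falling on each topic's genes."""
    mixtures = {}
    for sample, genes in counts.items():
        total = sum(genes.values())
        per_topic = defaultdict(float)
        for gene, weight in genes.items():
            per_topic[gene_topic[gene]] += weight / total
        mixtures[sample] = dict(per_topic)
    return mixtures


edges = bipartite_edges(counts)
mixtures = topic_mixture(counts, gene_topic)
```

Each sample is reduced here from one value per gene to one value per topic. A classifier of the kind mentioned in the abstract would take such topic-space vectors as its input features.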
