Subtype identification from heterogeneous TCGA datasets on a genomic scale by multi-view clustering with enhanced consensus.

Authors

Cai Menglan, Li Limin

Affiliation

School of Mathematics and Statistics, Xi'an Jiaotong University, Xianning West 28, Xi'an, China.

Publication

BMC Med Genomics. 2017 Dec 21;10(Suppl 4):75. doi: 10.1186/s12920-017-0306-x.

DOI:10.1186/s12920-017-0306-x
PMID:29322925
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5763310/
Abstract

BACKGROUND

The Cancer Genome Atlas (TCGA) has collected transcriptome, genome and epigenome information for over 20 cancers from thousands of patients. The availability of these diverse data types makes it necessary to combine them in order to capture the heterogeneity of biological processes and phenotypes, and to identify homogeneous subtypes for cancers such as breast cancer. Many multi-view clustering approaches have been proposed to discover clusters across different data types. The problem is challenging when the clustering structures of different data types agree poorly.

RESULTS

In this work, we first propose a multi-view clustering approach with consensus (CMC), which tries to find consensus kernels among views by using the Hilbert-Schmidt Independence Criterion (HSIC). To handle poor agreement among views, we further propose a multi-view clustering approach with enhanced consensus (ECMC), which decomposes the kernel information in each view into a consensus part and a disagreement part. The consensus parts of different views are supposed to be similar, and the disagreement parts should be independent of the consensus parts. Both the CMC and ECMC models can be solved by alternating updates with semi-definite programming. Our experiments on both simulated datasets and real-world benchmark datasets show that the ECMC model achieves higher clustering accuracy than other state-of-the-art multi-view clustering approaches. We also apply the ECMC model to integrate mRNA expression, DNA methylation and microRNA (miRNA) expression data for five cancer datasets, and survival analysis shows that the ECMC model outperforms other methods in identifying cancer subtypes. Using Fisher's combination test, we found that three computed subtypes roughly correspond to three known breast cancer subtypes: luminal B, HER2 and basal-like.
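As a rough illustration of the dependence measure named above, the following sketch computes the standard biased empirical estimate of the Hilbert-Schmidt Independence Criterion between two kernel matrices, trace(K H L H) / (n - 1)^2, where H is the centering matrix. The abstract does not give the exact objective used in CMC or ECMC, so this is not the authors' model; the Gaussian kernel, the `gamma` value, the function names and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix over the rows of X (illustrative kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def hsic(K, L):
    """Biased empirical HSIC between two n x n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Synthetic "views" of the same 200 samples (hypothetical data).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x + 0.1 * rng.normal(size=(200, 1))  # strongly dependent on x
y_ind = rng.normal(size=(200, 1))            # independent of x

hsic_dep = hsic(rbf_kernel(x), rbf_kernel(y_dep))
hsic_ind = hsic(rbf_kernel(x), rbf_kernel(y_ind))
print(f"dependent views:   {hsic_dep:.4f}")
print(f"independent views: {hsic_ind:.4f}")
```

A larger value indicates stronger statistical dependence between the two views, which is the sense in which a consensus criterion would favor similar kernels and an independence constraint would push the value toward zero.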

CONCLUSION

Integrating heterogeneous TCGA datasets with our proposed multi-view clustering approach, ECMC, can effectively identify cancer subtypes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/ce0224021673/12920_2017_306_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/0b4c3b7ae41d/12920_2017_306_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/76d88c435e52/12920_2017_306_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/7a9de6ab766b/12920_2017_306_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/513ba29c17f2/12920_2017_306_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/968d6af565a5/12920_2017_306_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/563d/5763310/b24d779b44c6/12920_2017_306_Fig7_HTML.jpg
