Clustering of cancer data based on Stiefel manifold for multiple views.

Affiliations

College of Mathematics and System Sciences, Xinjiang University, Urumqi, China.

School of Computer Science and Technology, Anhui University, Hefei, China.

Publication information

BMC Bioinformatics. 2021 May 25;22(1):268. doi: 10.1186/s12859-021-04195-4.

DOI: 10.1186/s12859-021-04195-4
PMID: 34034643
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8152349/
Abstract

BACKGROUND

In recent years, various sequencing techniques have been used to collect biomedical omics datasets, and multiple types of omics data can usually be obtained from a single patient sample. Clustering of omics data plays an indispensable role in biological and medical research because it helps reveal structure shared across multiple data collections. Nevertheless, clustering omics data poses many challenges, chiefly the high dimensionality of the data and the small number of samples. It is therefore difficult to find a suitable integration method for the structural analysis of multiple datasets.

RESULTS

In this paper, we propose a multi-view clustering method based on the Stiefel manifold (MCSM). MCSM comprises three core steps. First, we establish a binary optimization model for the simultaneous clustering problem. Second, we solve the optimization problem with a linear search algorithm on the Stiefel manifold. Finally, we integrate the clustering results obtained from three omics types using the k-nearest neighbor method. We applied this approach to four cancer datasets from TCGA. The results show that our method is superior to several state-of-the-art methods, which depend on the assumption that the underlying cluster structure is the same across omics types.
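
The second step can be illustrated with a minimal sketch, under stated assumptions. The abstract does not give the MCSM objective, so the code below uses a stand-in: the spectral relaxation of clustering, which maximizes trace(X^T A X) over matrices X with orthonormal columns (points on the Stiefel manifold) for a symmetric similarity matrix A. It also assumes that "linear search" refers to a backtracking (Armijo) line search along the projected gradient, and it uses a QR-based retraction. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def qr_retract(Y):
    """Map a full-rank n x k matrix back onto the Stiefel manifold via thin QR."""
    Q, R = np.linalg.qr(Y)
    # Flip column signs so that R would have a non-negative diagonal.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

def riemannian_grad(X, G):
    """Project the Euclidean gradient G onto the tangent space at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def stiefel_line_search(A, k, iters=500, seed=0, tol=1e-8):
    """Maximize trace(X^T A X) subject to X^T X = I_k (A symmetric).

    Stand-in objective for illustration only, not the MCSM model.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    X = qr_retract(rng.standard_normal((n, k)))
    f = np.trace(X.T @ A @ X)
    for _ in range(iters):
        G = 2.0 * A @ X                 # Euclidean gradient of the objective
        xi = riemannian_grad(X, G)      # ascent direction in the tangent space
        g2 = np.sum(xi * xi)
        if g2 < tol:
            break
        t = 1.0
        while t > 1e-12:
            # Backtracking: halve the step until the Armijo condition holds.
            X_new = qr_retract(X + t * xi)
            f_new = np.trace(X_new.T @ A @ X_new)
            if f_new >= f + 1e-4 * t * g2:
                break
            t *= 0.5
        else:
            break                       # no acceptable step was found
        X, f = X_new, f_new
    return X, f
```

Every iterate stays exactly on the manifold because each trial point is retracted before the objective is evaluated, and the Armijo test guarantees that the objective never decreases. In a clustering pipeline the columns of the returned X would typically be passed to a discretization step such as k-means; that step is omitted here.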

CONCLUSION

In particular, our approach outperforms the compared approaches when the underlying clusters are inconsistent across omics types. For patients with different subtypes, it identifies both consistent and differential clusters at the same time.
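
To make the distinction between consistent and differential clusters concrete, here is a small illustrative sketch. It is not the procedure used in the paper. It assumes each omics view has already produced a cluster label per patient, and it compares views through co-association matrices, which do not depend on how the labels are numbered in each view. The function names and the toy labels are hypothetical.

```python
import numpy as np

def coassociation(labels):
    """n x n boolean matrix: True where two samples share a cluster label."""
    labels = np.asarray(labels)
    return labels[:, None] == labels[None, :]

def consistent_and_differential(view_labels):
    """Classify sample pairs by how they co-cluster across views.

    Returns two boolean n x n matrices:
      consistent   - the pair is co-clustered in every view
      differential - the pair is co-clustered in some views but not all
    """
    co = np.stack([coassociation(l) for l in view_labels])  # views x n x n
    in_all = co.all(axis=0)
    in_any = co.any(axis=0)
    return in_all, in_any & ~in_all

# Toy labels for six patients in three hypothetical omics views.
views = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 0, 2],   # same idea as view 1, but with different label names
    [0, 0, 1, 1, 1, 1],
]
consistent, differential = consistent_and_differential(views)
```

In this toy input, patients 0 and 1 are grouped together in all three views, so the pair is consistent, while patients 4 and 5 are grouped together in the first and third views only, so the pair is differential.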

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b81/8152349/317b84b0f715/12859_2021_4195_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b81/8152349/0fd29ac09314/12859_2021_4195_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b81/8152349/af6c1d52cd1a/12859_2021_4195_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b81/8152349/22321a7025ea/12859_2021_4195_Fig4_HTML.jpg
