Integrative clustering methods for high-dimensional molecular data.

Author information

Chalise Prabhakar, Koestler Devin C, Bimali Milan, Yu Qing, Fridley Brooke L

Affiliation

Department of Biostatistics, University of Kansas Medical Center, Kansas City, KS 66160, USA.

Publication information

Transl Cancer Res. 2014 Jun 1;3(3):202-216. doi: 10.3978/j.issn.2218-676X.2014.06.03.

DOI:10.3978/j.issn.2218-676X.2014.06.03

PMID:25243110

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4166480/

Abstract

High-throughput 'omic' data, such as gene expression, DNA methylation, and DNA copy number, have played an instrumental role in furthering our understanding of the molecular basis of human health and disease. Because cells with similar morphological characteristics can exhibit entirely different molecular profiles, and because these discrepancies might further our understanding of patient-level variability in clinical outcomes, there is significant interest in using high-throughput 'omic' data to identify novel molecular subtypes of a disease. While numerous clustering methods have been proposed for identifying molecular subtypes, most were developed for a single 'omic' data type and may not be appropriate when more than one 'omic' data type is collected on study subjects. Given that complex diseases, such as cancer, arise as a result of genomic, epigenomic, transcriptomic, and proteomic alterations, integrative clustering methods for the simultaneous clustering of multiple 'omic' data types have great potential to aid in molecular subtype discovery. Traditionally, ad hoc manual data integration has been performed using the results obtained from clustering individual 'omic' data types on the same set of patient samples. However, such methods often result in inconsistent assignment of subjects to molecular cancer subtypes. Recently, several methods have been proposed in the literature that offer a rigorous framework for the simultaneous integration of multiple 'omic' data types in a single comprehensive analysis. In this paper, we present a systematic review of existing integrative clustering methods.
