A novel approach for data integration and disease subtyping.

Affiliations

Department of Computer Science and Engineering, University of Nevada, Reno, Nevada 89557, USA.

Department of Computer Science, Wayne State University, Detroit, Michigan 48202, USA.

Publication information

Genome Res. 2017 Dec;27(12):2025-2039. doi: 10.1101/gr.215129.116. Epub 2017 Oct 24.

DOI:10.1101/gr.215129.116

PMID:29066617

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5741060/

Abstract

Advances in high-throughput technologies allow for measurements of many types of omics data, yet the meaningful integration of several different data types remains a significant challenge. Another important and difficult problem is the discovery of molecular disease subtypes characterized by relevant clinical differences, such as survival. Here we present a novel approach, called Perturbation clustering for data INtegration and disease Subtyping (PINS), which is able to address both challenges. The framework has been validated on thousands of cancer samples, using gene expression, DNA methylation, noncoding microRNA, and copy number variation data available from the Gene Expression Omnibus, the Broad Institute, The Cancer Genome Atlas (TCGA), and the European Genome-Phenome Archive. This simultaneous subtyping approach accurately identifies known cancer subtypes and novel subgroups of patients with significantly different survival profiles. The results were obtained from genome-scale molecular data without any other type of prior knowledge. The approach is sufficiently general to replace existing unsupervised clustering approaches outside the scope of bio-medical research, with the additional ability to integrate multiple types of data.
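The abstract does not spell out how perturbation clustering works, so the following is only a toy sketch of the general idea, not the PINS implementation. It assumes that "perturbation" means adding small Gaussian noise to the data and that a good number of clusters is one whose partition barely changes under that noise. The k-means routine, the farthest-point initialisation, the Rand-index agreement score, the noise level, and every function name here are illustrative assumptions.

```python
import random
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centers[c])  # keep the old center for an empty cluster
        if updated == centers:
            break
        centers = updated
    return labels


def pair_agreement(labels_a, labels_b):
    """Rand index: share of sample pairs grouped the same way in both partitions."""
    pairs = list(combinations(range(len(labels_a)), 2))
    same = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j]) for i, j in pairs
    )
    return same / len(pairs)


def stability_by_k(points, ks, noise_sd=0.5, n_perturb=20, seed=0):
    """For each k, average agreement between the original and perturbed partitions."""
    rng = random.Random(seed)
    scores = {}
    for k in ks:
        base = kmeans_labels(points, k)
        total = 0.0
        for _ in range(n_perturb):
            # Perturb every coordinate with independent Gaussian noise.
            noisy = [tuple(x + rng.gauss(0.0, noise_sd) for x in p) for p in points]
            total += pair_agreement(base, kmeans_labels(noisy, k))
        scores[k] = total / n_perturb
    return scores


# Toy data (hypothetical): three well-separated 2-D blobs of 15 samples each.
data_rng = random.Random(1)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + data_rng.gauss(0.0, 0.5), cy + data_rng.gauss(0.0, 0.5))
    for cx, cy in blob_centers
    for _ in range(15)
]
scores = stability_by_k(points, ks=(2, 3, 4))
print(scores)
```

On this toy data the partition for k = 3 matches the three blobs and is unaffected by the noise, so its score is the maximum possible value of 1.0. Integrating several data types, the second aim named in the abstract, is outside the scope of this sketch.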

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4612/5741060/2916aefb40ab/2025f01.jpg
