iCluF: an unsupervised iterative cluster-fusion method for patient stratification using multiomics data.

Authors

Shakyawar Sushil K, Sajja Balasrinivasa R, Patel Jai Chand, Guda Chittibabu

Affiliations

Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, United States.

Department of Radiology, University of Nebraska Medical Center, Omaha, NE 68198, United States.

Publication information

Bioinform Adv. 2024 Jan 30;4(1):vbae015. doi: 10.1093/bioadv/vbae015. eCollection 2024.

DOI: 10.1093/bioadv/vbae015

PMID: 38698887

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11063539/

Abstract

MOTIVATION

Patient stratification is crucial for the effective treatment or management of heterogeneous diseases, including cancers. Multiomic technologies facilitate molecular characterization of human diseases; however, the complexity of the data calls for robust machine-learning tools that integrate these data for patient stratification.

RESULTS

iCluF iteratively integrates three types of multiomic data (mRNA, miRNA, and DNA methylation) using pairwise patient similarity matrices built from each omic data type. Intermediate omic-specific neighborhood matrices drive iterative matrix fusion and message passing among the similarity matrices, yielding a final integrated matrix that represents all the omics profiles of a patient and is then used to cluster patients into subtypes. iCluF outperforms other methods, producing subtypes with significantly different survival profiles across 8581 patients from 30 different cancers in TCGA. iCluF also predicted the four intrinsic subtypes of Breast Invasive Carcinoma with adjusted Rand index and Fowlkes-Mallows scores of 0.72 and 0.83, respectively. Gini importance scores showed that methylation features were the most decisive for identifying disease subtypes, followed by mRNA and miRNA features. iCluF can be applied to stratify patients with any disease for which multiomic datasets are available.
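
The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, written in the style of similarity network fusion rather than as the authors' implementation. It builds one patient similarity matrix per omic type, derives a k-nearest-neighbor neighborhood matrix from each, and then repeatedly passes each omic's similarities through the neighborhoods of the others. The function names, the Gaussian kernel, the bandwidth heuristic, and the parameters `k` and `n_iter` are all assumptions made for illustration. The clustering step that would follow (for example, spectral clustering on the fused matrix) is not shown.

```python
import numpy as np


def affinity(X):
    """Pairwise patient similarity for one omic matrix (patients x features)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    scale = d2.mean() + 1e-12  # assumed bandwidth: mean squared distance
    return np.exp(-d2 / scale)


def row_normalize(W):
    """Scale each row to sum to 1."""
    return W / W.sum(axis=1, keepdims=True)


def knn_kernel(W, k):
    """Neighborhood matrix: keep only each patient's k most similar patients."""
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]  # the patient itself is among them
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)


def fuse(omics, k=5, n_iter=10):
    """Iteratively fuse omic-specific similarity matrices into one matrix."""
    P = [row_normalize(affinity(X)) for X in omics]
    S = [knn_kernel(p, k) for p in P]
    for _ in range(n_iter):
        new_P = []
        for v in range(len(P)):
            others = [P[u] for u in range(len(P)) if u != v]
            avg = sum(others) / len(others)
            # Message passing: diffuse the other omics' similarities
            # through this omic's local neighborhood structure.
            updated = S[v] @ avg @ S[v].T
            new_P.append(row_normalize((updated + updated.T) / 2.0))
        P = new_P
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2.0  # symmetric integrated matrix


# Synthetic example: 20 patients in two groups, three omic layers of
# different widths standing in for mRNA, miRNA, and methylation features.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
omics = [rng.normal(loc=labels[:, None] * 3.0, scale=1.0, size=(20, p))
         for p in (30, 20, 40)]
fused = fuse(omics, k=5, n_iter=10)
same = labels[:, None] == labels[None, :]
print(fused[same].mean(), fused[~same].mean())
```

On this toy input the fused matrix should assign higher average similarity to patient pairs from the same group than to pairs from different groups, which is the property a downstream clustering step relies on.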

AVAILABILITY AND IMPLEMENTATION

Source code and datasets are available at https://github.com/GudaLab/iCluF_core.
