Multi-layer matrix factorization for cancer subtyping using full and partial multi-omics dataset.

Author information

Ren Yingxuan, Ren Fengtao, Yang Bo

Affiliations

National University of Singapore, 119077, Singapore.

Department of Engineering, The Chinese University of Hong Kong, 999077, Hong Kong, China.

Publication information

Brief Bioinform. 2025 Aug 31;26(5). doi: 10.1093/bib/bbaf448.

Abstract

Cancer, with its inherent heterogeneity, is commonly categorized into distinct subtypes based on unique traits, cellular origins, and molecular markers specific to each type. However, current studies rely primarily on complete multi-omics datasets to predict cancer subtypes. They often overlook predictive performance when some omics data are missing, and they neglect the implicit relationships across multiple layers of omics data integration. This paper introduces Multi-Layer Matrix Factorization (MLMF), a novel approach for cancer subtyping that employs multi-omics data clustering. MLMF first processes the multi-omics feature matrices by multi-layer linear or nonlinear factorization, decomposing the original data into latent feature representations unique to each omics type. These latent representations are then fused into a consensus form, on which spectral clustering is performed to determine subtypes. Additionally, MLMF incorporates a class indicator matrix to handle missing omics data, creating a unified framework that can manage both complete and incomplete multi-omics data. Extensive experiments on 12 multi-omics cancer datasets, both complete and with missing values, demonstrate that MLMF performs comparably to or better than several state-of-the-art approaches. MLMF is open source and available at https://github.com/renyingxuan/MLMF.git.
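The pipeline described in the abstract (per-omics multi-layer factorization, fusion into a consensus, spectral clustering, and an indicator for missing samples) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is in the linked repository. Every name here (`multilayer_factorize`, `consensus_affinity`, `spectral_clusters`, `toy_example`) is hypothetical. The sketch also makes several assumptions: greedy truncated SVD stands in for the paper's factorization, a boolean mask per omics stands in for the class indicator matrix, fusion is a plain average of Gaussian affinities, and every sample is measured in at least one omics.

```python
import numpy as np

def multilayer_factorize(X, layer_sizes, activation=None):
    """Greedy layer-wise factorization of a samples-by-features matrix.
    Assumption: each layer is a truncated SVD of the previous sample factors.
    With activation=None the stack collapses to one truncated SVD (up to
    column signs); pass e.g. np.tanh for a nonlinear variant."""
    H = X - X.mean(axis=0)
    for k in layer_sizes:
        U, s, _ = np.linalg.svd(H, full_matrices=False)
        H = U[:, :k] * s[:k]              # samples-by-k latent representation
        if activation is not None:
            H = activation(H)
    return H

def consensus_affinity(latents, masks, n_samples):
    """Average per-omics Gaussian affinities over the omics in which both
    samples of a pair were measured. masks[v][i] is True when sample i is
    present in omics v (a stand-in for the indicator matrix)."""
    total = np.zeros((n_samples, n_samples))
    count = np.zeros((n_samples, n_samples))
    for H, mask in zip(latents, masks):
        idx = np.flatnonzero(mask)
        d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(axis=-1)
        sigma2 = np.median(d2[d2 > 0])    # data-driven kernel bandwidth
        total[np.ix_(idx, idx)] += np.exp(-d2 / sigma2)
        count[np.ix_(idx, idx)] += 1
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

def spectral_clusters(A, n_clusters, n_iter=50):
    """Normalized spectral clustering with a small deterministic k-means."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    M = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^-1/2 A D^-1/2
    _, vecs = np.linalg.eigh(M)                         # ascending eigenvalues
    E = vecs[:, -n_clusters:]
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    centers = [E[0]]                                    # farthest-point init
    for _ in range(1, n_clusters):
        dist = np.min([((E - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)
        centers = np.array([E[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(n_clusters)])
    return labels

def toy_example(seed=0):
    """Two synthetic 'omics' with two planted subtypes; omics 2 lacks 10 samples."""
    rng = np.random.default_rng(seed)
    truth = np.repeat([0, 1], 20)
    shift = np.where(truth == 0, -3.0, 3.0)[:, None]
    omics1 = shift + rng.normal(size=(40, 30))
    omics2 = shift + rng.normal(size=(40, 20))
    mask1 = np.ones(40, dtype=bool)
    mask2 = np.ones(40, dtype=bool)
    mask2[[0, 1, 2, 3, 4, 20, 21, 22, 23, 24]] = False  # missing in omics 2
    latents = [multilayer_factorize(omics1[mask1], (10, 3)),
               multilayer_factorize(omics2[mask2], (10, 3))]
    A = consensus_affinity(latents, [mask1, mask2], 40)
    return spectral_clusters(A, 2)
```

Calling `toy_example()` returns one cluster label per sample, including the samples that are absent from the second synthetic omics. Averaging only over the omics in which both samples of a pair were measured is one simple way to keep complete and incomplete data in a single code path; the paper's actual objective and optimization may differ.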

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f064/12418959/e782ce018ec0/bbaf448f1.jpg
