
Bidimensional Linked Matrix Factorization for Pan-Omics Pan-Cancer Analysis

Authors

Lock Eric F, Park Jun Young, Hoadley Katherine A

Affiliations

Division of Biostatistics, School of Public Health, University of Minnesota.

Department of Statistical Sciences, Faculty of Arts & Science, University of Toronto.

Publication information

Ann Appl Stat. 2022 Mar;16(1):193-215. doi: 10.1214/21-AOAS1495. Epub 2022 Mar 28.

Abstract

Several modern applications require the integration of multiple large data matrices that have shared rows and/or columns. For example, cancer studies that integrate multiple omics platforms across multiple types of cancer have extended our knowledge of molecular heterogeneity beyond what was observed in single tumor and single platform studies. However, these studies have been limited by available statistical methodology. We propose a flexible approach to the simultaneous factorization and decomposition of variation across such matrices, BIDIFAC+. BIDIFAC+ decomposes variation into a series of low-rank components that may be shared across any number of row sets (e.g., omics platforms) or column sets (e.g., cancer types). This builds on a growing literature for the factorization and decomposition of linked matrices, which has primarily focused on multiple matrices that are linked in one dimension (rows or columns) only. Our objective function extends nuclear norm penalization, is motivated by random matrix theory, gives a unique decomposition under relatively mild conditions, and can be shown to give the mode of a Bayesian posterior distribution. We apply BIDIFAC+ to pan-omics pan-cancer data from TCGA, identifying shared and specific modes of variability across different omics platforms and 29 different cancer types.
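As background for the nuclear norm penalization that the abstract says the objective extends, the single-matrix case can be sketched as follows. This is not the BIDIFAC+ algorithm, which handles many linked matrices at once; it only illustrates the standard result that minimizing 0.5*||X - S||_F^2 + lam*||S||_* is solved by soft-thresholding the singular values of X. The function name, the simulated data, and the penalty value sqrt(m) + sqrt(n) are assumptions made for illustration. That penalty value bounds the expected largest singular value of an m x n matrix of independent unit-variance Gaussian noise, which is one way random matrix theory can motivate a choice of penalty.

```python
import numpy as np

def soft_threshold_svd(X, lam):
    """Minimize 0.5*||X - S||_F^2 + lam*||S||_* over S.

    The minimizer keeps the singular vectors of X and shrinks each
    singular value by lam, truncating at zero.
    """
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    d_shrunk = np.maximum(d - lam, 0.0)
    return (U * d_shrunk) @ Vt

# Hypothetical toy data: a rank-2 signal plus unit-variance Gaussian noise.
rng = np.random.default_rng(0)
m, n, r = 60, 40, 2
signal = 3.0 * rng.normal(size=(m, r)) @ rng.normal(size=(r, n))
X = signal + rng.normal(size=(m, n))

# Assumed penalty: an upper bound on the expected largest singular
# value of an m x n matrix of i.i.d. N(0, 1) noise.
lam = np.sqrt(m) + np.sqrt(n)

S_hat = soft_threshold_svd(X, lam)
est_rank = int(np.linalg.matrix_rank(S_hat))
print("estimated rank:", est_rank)
```

Singular values that fall below the penalty are set exactly to zero, so the estimate is low rank without the rank being fixed in advance.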


Similar articles

Integrative factorization of bidimensionally linked matrices.
Biometrics. 2020 Mar;76(1):61-74. doi: 10.1111/biom.13141. Epub 2019 Nov 10.

Linked matrix factorization.
Biometrics. 2019 Jun;75(2):582-592. doi: 10.1111/biom.13010. Epub 2019 Apr 2.

Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data.
Comput Stat Data Anal. 2024 Sep;197. doi: 10.1016/j.csda.2024.107974. Epub 2024 Apr 30.

Cited by

Empirical Bayes Linked Matrix Decomposition.
Mach Learn. 2024 Oct;113(10):7451-7477. doi: 10.1007/s10994-024-06599-8. Epub 2024 Aug 7.

Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data.
Comput Stat Data Anal. 2024 Sep;197. doi: 10.1016/j.csda.2024.107974. Epub 2024 Apr 30.

