Empirical Bayes Linked Matrix Decomposition.

Author

Lock Eric F

Affiliation

Division of Biostatistics and Health Data Science, School of Public Health, University of Minnesota, Minneapolis, 55455, MN, USA.

Publication

Mach Learn. 2024 Oct;113(10):7451-7477. doi: 10.1007/s10994-024-06599-8. Epub 2024 Aug 7.

DOI: 10.1007/s10994-024-06599-8
PMID: 39759800
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11698509/

Abstract

Data for several applications in diverse fields can be represented as multiple matrices that are linked across rows or columns. This is particularly common in molecular biomedical research, in which multiple molecular "omics" technologies may capture different feature sets (e.g., corresponding to rows in a matrix) and/or different sample populations (corresponding to columns). This has motivated a large body of work on integrative matrix factorization approaches that identify and decompose low-dimensional signal that is shared across multiple matrices or specific to a given matrix. We propose an empirical variational Bayesian approach to this problem that has several advantages over existing techniques, including the flexibility to accommodate shared signal over any number of row or column sets (i.e., bidimensional integration), an intuitive model-based objective function that yields appropriate shrinkage for the inferred signals, and a relatively efficient estimation algorithm with no tuning parameters. A general result establishes conditions for the uniqueness of the underlying decomposition for a broad family of methods that includes the proposed approach. For scenarios with missing data, we describe an associated iterative imputation approach that is novel for the single-matrix context and a powerful approach for "blockwise" imputation (in which an entire row or column is missing) in various linked matrix contexts. Extensive simulations show that the method performs very well under different scenarios with respect to recovering underlying low-rank signal, accurately decomposing shared and specific signals, and accurately imputing missing data. The approach is applied to gene expression and miRNA data from breast cancer tissue and normal breast tissue, for which it gives an informative decomposition of variation and outperforms alternative strategies for missing data imputation.
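
The iterative imputation idea mentioned in the abstract can be illustrated, for the single-matrix case, with a minimal sketch. This is not the paper's algorithm: the empirical variational Bayes fit is replaced here by a plain truncated SVD with a user-chosen rank, so the shrinkage and the tuning-free rank selection described above are absent. The function name `impute_low_rank` and its parameters (`rank`, `n_iter`, `tol`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def impute_low_rank(X, rank, n_iter=200, tol=1e-8):
    """Fill NaN entries of X by alternating a low-rank fit with re-imputation.

    Assumption: a hard rank-`rank` truncated SVD stands in for the
    empirical Bayes low-rank fit described in the abstract.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialize missing entries with the mean of the observed entries.
    filled = np.where(missing, np.nanmean(X), X)
    for _ in range(n_iter):
        # Low-rank fit to the currently completed matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        fit = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries; overwrite only the missing ones.
        new = np.where(missing, fit, X)
        change = np.linalg.norm(new - filled)
        filled = new
        if change <= tol * max(1.0, np.linalg.norm(filled)):
            break
    return filled

# Usage: a rank-1 matrix with one entry hidden.
M = np.outer(np.arange(1.0, 7.0), np.arange(1.0, 6.0))
X_obs = M.copy()
X_obs[2, 3] = np.nan
X_hat = impute_low_rank(X_obs, rank=1)
```

A single-matrix scheme like this has nothing to borrow from when an entire row or column is missing; the blockwise case in the abstract relies on signal shared with the linked matrices.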
