Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data.

Author information

Samorodnitsky Sarah, Wendt Chris H, Lock Eric F

Affiliations

Division of Biostatistics, University of Minnesota, Minneapolis, 55455, MN, USA.

Fred Hutch Cancer Center, Seattle, 98109, WA, USA.

Publication information

Comput Stat Data Anal. 2024 Sep;197. doi: 10.1016/j.csda.2024.107974. Epub 2024 Apr 30.

Abstract

Integrative factorization methods for multi-omic data estimate factors explaining biological variation. Factors can be treated as covariates to predict an outcome, and the factorization can be used to impute missing values. However, no available methods provide a comprehensive framework for statistical inference and uncertainty quantification for these tasks. A novel framework, Bayesian Simultaneous Factorization (BSF), is proposed to decompose multi-omic variation into joint and individual structures simultaneously within a probabilistic framework. BSF uses conjugate normal priors, and the posterior mode of this model can be estimated by solving a structured nuclear norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. BSF is then extended to simultaneously predict a continuous or binary phenotype while estimating latent factors, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation, i.e., imputation during the model-fitting process, and full posterior inference for missing data, including "blockwise" missingness. Simulations show that BSFP is competitive in recovering latent variation structure and demonstrate the importance of accounting for uncertainty in the estimated factorization within the predictive model. The imputation performance of BSF is examined via simulation under missing-at-random and missing-not-at-random assumptions. Finally, BSFP is used to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated obstructive lung disease, revealing multi-omic patterns related to lung function decline and a cluster of patients with obstructive lung disease driven by shared metabolomic and proteomic abundance patterns.
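
To illustrate why a nuclear norm penalty can perform rank selection, here is a minimal single-matrix sketch. It is not the structured multi-block objective of BSF, and the function name `svt`, the penalty value `lam`, and the simulated data are all assumptions made for illustration. The sketch relies on two standard facts: the nuclear norm of X equals the minimum of (||U||_F^2 + ||V||_F^2)/2 over factorizations X = UV^T, which is one way normal priors on factors can lead to a nuclear norm penalty on their product, and the problem min_X 0.5||Y - X||_F^2 + lam||X||_* is solved by soft-thresholding the singular values of Y.

```python
import numpy as np

def svt(Y, lam):
    """Singular value soft-thresholding (hypothetical helper).

    Returns the minimizer of 0.5 * ||Y - X||_F^2 + lam * ||X||_*
    together with its rank (number of singular values above lam).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)   # shrink every singular value by lam
    rank = int(np.sum(s_shrunk > 0))      # values at or below lam are zeroed out
    X = (U[:, :rank] * s_shrunk[:rank]) @ Vt[:rank, :]
    return X, rank

# Illustrative data (assumed, not from the paper): rank-2 signal plus noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 30))
Y = signal + 0.1 * rng.normal(size=(50, 30))

# The penalty level is an arbitrary choice that sits above the noise scale.
X_hat, rank_hat = svt(Y, lam=2.0)
print("estimated rank:", rank_hat)
```

Singular values at or below the penalty are set exactly to zero, so the estimated rank falls out of the optimization instead of being fixed in advance. How the penalty relates to the prior hyperparameters in BSF is described in the paper itself and is not reproduced here.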
