Graph-guided Bayesian Factor Model for Integrative Analysis of Multi-modal Data with Noisy Network Information.

Author information

Li Wenrui, Zhang Qiyiwen, Qu Kewen, Long Qi

Affiliations

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, 423 Guardian Drive, Philadelphia, 19104, Pennsylvania, U.S.A.

Publication information

Stat Biosci. 2024 Aug 11. doi: 10.1007/s12561-024-09452-7.

Abstract

There is a growing body of literature on factor analysis that can capture individual and shared structures in multi-modal data. However, few of these approaches incorporate biological knowledge such as functional genomics and functional metabolomics. Graph-guided statistical learning methods that can incorporate knowledge of underlying networks have been shown to improve prediction and classification accuracy and to yield more interpretable results. Yet these methods typically use graphs extracted from existing databases or rely on subject matter expertise, both of which are known to be incomplete and may contain false edges. To address this gap, we propose a graph-guided Bayesian factor model that can account for network noise and identify globally shared, partially shared and modality-specific latent factors in multi-modal data. Specifically, we use two sources of network information, the noisy graph extracted from existing databases and the graph estimated from observed features in the dataset at hand, to inform the model about the true underlying network via a latent scale modeling framework. This model is coupled with a Bayesian factor analysis model with shrinkage priors that encourage feature-wise and modal-wise sparsity, thereby allowing feature selection and identification of factors of each type. We develop an efficient Markov chain Monte Carlo algorithm for posterior sampling. We demonstrate the advantages of our method over existing methods in simulations and through analyses of gene expression and metabolomics datasets for Alzheimer's disease.
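The distinction between globally shared, partially shared and modality-specific factors can be illustrated with a small simulation. The sketch below is not the authors' model: it omits the graph-guided priors, the shrinkage priors and the MCMC sampler, and only shows how modality-wise sparsity in the loading matrices determines each factor's type. All names, dimensions and the activity pattern (`active`, `feature_dims`, `classify_factors`) are hypothetical choices made for illustration.

```python
import numpy as np


def classify_factors(active):
    """Label each factor by how many modalities load on it.

    active: boolean array of shape (n_modalities, n_factors); True means
    the modality has a nonzero loading block for that factor.
    """
    n_modalities = active.shape[0]
    labels = []
    for count in active.sum(axis=0):
        if count == n_modalities:
            labels.append("globally shared")
        elif count > 1:
            labels.append("partially shared")
        elif count == 1:
            labels.append("modality-specific")
        else:
            labels.append("inactive")
    return labels


rng = np.random.default_rng(0)
n_samples = 50
feature_dims = [30, 20, 10]  # hypothetical feature counts for 3 modalities

# Hypothetical activity pattern: rows are modalities, columns are factors.
active = np.array([
    [True, True, True, False],
    [True, True, False, False],
    [True, False, False, True],
])
n_factors = active.shape[1]

# Latent factors shared across all modalities (one row per sample).
Z = rng.standard_normal((n_samples, n_factors))

# Each modality gets its own loading matrix; columns of inactive factors
# are set to zero, which is what modality-wise sparsity amounts to here.
loadings = []
X = []
for m, p in enumerate(feature_dims):
    W = rng.standard_normal((p, n_factors)) * active[m]
    loadings.append(W)
    X.append(Z @ W.T + 0.1 * rng.standard_normal((n_samples, p)))

labels = classify_factors(active)
for k, label in enumerate(labels):
    print(f"factor {k}: {label}")
```

In a fitted model the activity pattern is not known in advance; according to the abstract, the shrinkage priors are what drive entire loading blocks toward zero so that such a pattern can be read off the posterior.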
