Negative binomial factor regression with application to microbiome data analysis.

Affiliations

Center for Computational Mathematics, Flatiron Institute, Simons Foundation, New York, New York, USA.

Department of Statistics, LMU München, Munich, Germany.

Publication information

Stat Med. 2022 Jul 10;41(15):2786-2803. doi: 10.1002/sim.9384. Epub 2022 Apr 24.

Abstract

The human microbiome provides essential physiological functions and helps maintain host homeostasis via the formation of intricate ecological host-microbiome relationships. While it is well established that the lifestyle of the host, dietary preferences, demographic background, and health status can influence microbial community composition and dynamics, robust generalizable associations between specific host-associated factors and specific microbial taxa have remained largely elusive. Here, we propose factor regression models that allow the estimation of structured parsimonious associations between host-related features and amplicon-derived microbial taxa. To account for the overdispersed nature of the amplicon sequencing count data, we propose negative binomial reduced rank regression (NB-RRR) and negative binomial co-sparse factor regression (NB-FAR). While NB-RRR encodes the underlying dependency among the microbial abundances as outcomes and the host-associated features as predictors through a rank-constrained coefficient matrix, NB-FAR uses a sparse singular value decomposition of the coefficient matrix. The latter approach avoids the notoriously difficult joint parameter estimation by extracting sparse unit-rank components of the coefficient matrix sequentially, effectively delivering interpretable bi-clusters of taxa and host-associated factors. To solve the nonconvex optimization problems associated with these factor regression models, we present a novel iterative block-wise majorization procedure. Extensive simulation studies and an application to the microbial abundance data from the American Gut Project (AGP) demonstrate the efficacy of the proposed procedure. In the AGP data, we identify several factors that strongly link dietary habits and host lifestyle to specific microbial families.
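To make the data-generating idea behind a rank-constrained negative binomial regression concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a log link, a single shared dispersion parameter, and no offset term; the dimensions and the names `X`, `C`, `U`, `V`, `phi`, and `Y` are hypothetical choices for illustration. It only generates overdispersed counts from a low-rank coefficient matrix and does not perform the estimation procedure described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: samples, host features, taxa, and assumed rank.
n, p, q, r = 200, 6, 10, 2

# Host-associated features (predictors).
X = rng.normal(size=(n, p))

# Rank-r coefficient matrix built from two thin factors (illustrative only).
U = rng.normal(size=(p, r))
V = rng.normal(size=(q, r))
C = 0.3 * U @ V.T

# Assumed log link: mean abundance of each taxon in each sample.
mu = np.exp(X @ C)

# Negative binomial counts via a gamma-Poisson mixture with
# mean mu and variance mu + mu**2 / phi (phi is an assumed dispersion).
phi = 2.0
lam = rng.gamma(shape=phi, scale=mu / phi)
Y = rng.poisson(lam)

# The coefficient matrix has rank r by construction.
rank_C = np.linalg.matrix_rank(C)
print("rank of C:", rank_C)
print("count matrix shape:", Y.shape)
```

In this sketch the low rank of `C` means every taxon responds to the host features only through `r` shared latent factors, which is the kind of structure a reduced rank model is meant to capture. Setting many entries of `U` and `V` to zero would mimic the co-sparse variant, where each factor links a subset of host features to a subset of taxa.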
