KERNEL-PENALIZED REGRESSION FOR ANALYSIS OF MICROBIOME DATA.

Author information

Randolph Timothy W, Zhao Sen, Copeland Wade, Hullar Meredith, Shojaie Ali

Affiliations

Fred Hutchinson Cancer Research Center.

University of Washington.

Publication information

Ann Appl Stat. 2018 Mar;12(1):540-566. doi: 10.1214/17-AOAS1102. Epub 2018 Mar 9.

Abstract

The analysis of human microbiome data is often based on dimension-reduced graphical displays and clusterings derived from vectors of microbial abundances in each sample. Common to these ordination methods is the use of biologically motivated definitions of similarity. Principal coordinate analysis, in particular, is often performed using ecologically defined distances, allowing analyses to incorporate context-dependent, non-Euclidean structure. In this paper, we go beyond dimension-reduced ordination methods and describe a framework of high-dimensional regression models that extends these distance-based methods. In particular, we use kernel-based methods to show how to incorporate a variety of extrinsic information, such as phylogeny, into penalized regression models that estimate taxon-specific associations with a phenotype or clinical outcome. Further, we show how this regression framework can be used to address the compositional nature of multivariate predictors composed of relative abundances; that is, vectors whose entries sum to a constant. We illustrate this approach with several simulations using data from two recent studies on gut and vaginal microbiomes. We conclude with an application to our own data, where we also incorporate a significance test for the estimated coefficients that represent associations between microbial abundance and percent fat.
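As a rough illustration of the two ingredients the abstract names, the sketch below turns a sample-by-sample distance matrix into the similarity kernel that underlies principal coordinate analysis, and then fits a ridge-type regression whose penalty is shaped by a taxon-by-taxon similarity matrix. It is a generic generalized-ridge sketch, not the authors' estimator: the function names and the symbols `Z` (abundance matrix), `Q` (taxon-level kernel, for example one derived from phylogeny), and `lam` (tuning parameter) are assumptions introduced here.

```python
import numpy as np

def gower_kernel(D):
    """Double-center squared distances: K = -1/2 * J D^2 J.
    This is the similarity matrix whose eigenvectors give the PCoA axes."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return -0.5 * J @ (D ** 2) @ J

def penalized_fit(Z, y, Q, lam):
    """Generalized ridge (hypothetical helper):
    argmin_b ||y - Z b||^2 + lam * b' Q^{-1} b.
    Uses the equivalent form b = Q Z' (Z Q Z' + lam I)^{-1} y,
    which never inverts Q explicitly."""
    n = Z.shape[0]
    alpha = np.linalg.solve(Z @ Q @ Z.T + lam * np.eye(n), y)
    return Q @ Z.T @ alpha
```

With `Q` set to the identity the fit reduces to ordinary ridge regression, and with Euclidean distances `gower_kernel` returns the centered Gram matrix of the samples, which is why ecological distances and phylogenetic kernels can be read as drop-in replacements for those defaults.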




