Identification of important regressor groups, subgroups and individuals via regularization methods: application to gut microbiome data.

Affiliations

Department of Epidemiology & Biostatistics, School of Rural Public Health, Texas A&M Health Science Center, College Station, TX 77843-1266, USA; School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia; Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA; Department of Poultry Science, Intercollegiate Faculty of Nutrition, Texas A&M University, College Station, TX 77840, USA.

Publication information

Bioinformatics. 2014 Mar 15;30(6):831-7. doi: 10.1093/bioinformatics/btt608. Epub 2013 Oct 24.

DOI: 10.1093/bioinformatics/btt608

PMID: 24162467

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3957069/

Abstract

MOTIVATION

Gut microbiota can be classified at multiple taxonomy levels. Strategies to use changes in microbiota composition to effect health improvements require knowing at which taxonomy level interventions should be aimed. Identifying these important levels is difficult, however, because most statistical methods only consider when the microbiota are classified at one taxonomy level, not multiple.

RESULTS

Using L1 and L2 regularizations, we developed a new variable selection method that identifies important features at multiple taxonomy levels. The regularization parameters are chosen by a new, data-adaptive, repeated cross-validation approach, which performed well. In simulation studies, our method outperformed competing methods: it more often selected significant variables, and had small false discovery rates and acceptable false-positive rates. Applying our method to gut microbiota data, we found which taxonomic levels were most altered by specific interventions or physiological status.
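The abstract does not state the exact penalty, so the following is only a minimal sketch of one common way to combine L1 and L2 regularization so that both whole groups (for example, a higher taxonomy level) and individual members (a lower level) can be selected: the proximal step of a sparse-group-lasso-style penalty. The function names, the group labels, and the values of `lam1` and `lam2` are illustrative assumptions, not the authors' implementation or the interface of their R package.

```python
import math


def soft_threshold(x, t):
    """L1 proximal step: move x toward zero by t, clipping at zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_group_prox(beta, groups, lam1, lam2):
    """Proximal operator of lam1 * ||b||_1 + lam2 * sum_g ||b_g||_2.

    beta   : list of coefficients
    groups : dict mapping a (hypothetical) group label to coefficient indices
    lam1   : individual-level (L1) penalty weight, assumed value
    lam2   : group-level (L2 norm) penalty weight, assumed value
    """
    out = [0.0] * len(beta)
    for idx in groups.values():
        # Step 1: element-wise soft-thresholding removes weak individuals.
        shrunk = [soft_threshold(beta[j], lam1) for j in idx]
        # Step 2: group soft-thresholding removes the whole group when the
        # norm of what survives is at most lam2.
        norm = math.sqrt(sum(v * v for v in shrunk))
        scale = max(0.0, 1.0 - lam2 / norm) if norm > 0 else 0.0
        for j, v in zip(idx, shrunk):
            out[j] = scale * v
    return out


# Hypothetical example: three groups standing in for higher-level taxa.
beta = [3.0, 0.5, -0.2, 0.4, -3.0]
groups = {"group_A": [0, 1], "group_B": [2, 3], "group_C": [4]}
result = sparse_group_prox(beta, groups, lam1=1.0, lam2=1.0)
selected = [j for j, v in enumerate(result) if v != 0.0]
print(result, selected)
```

In this toy input, `group_B` is dropped as a whole, while `group_A` is kept but loses its weaker member, which is the kind of multi-level sparsity the abstract describes. In practice the two penalty weights would be tuned, for instance by the repeated cross-validation the authors mention.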

AVAILABILITY

The new approach is implemented in an R package, which is freely available from the corresponding author.

CONTACT

tpgarcia@srph.tamhsc.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
