A Zero-inflated Beta-binomial Model for Microbiome Data Analysis.

Author information

Hu Tao, Gallins Paul, Zhou Yi-Hui

Affiliations

Bioinformatics Research Center, North Carolina State University, NC, 27695.

Department of Biological Sciences and Bioinformatics Research Center, North Carolina State University, NC, 27695.

Publication information

Stat (Int Stat Inst). 2018;7(1). doi: 10.1002/sta4.185. Epub 2018 Jun 19.

DOI:10.1002/sta4.185

PMID:30197785

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6124506/

Abstract

The microbiome is increasingly recognized as an important aspect of the health of host species, involved in many biological pathways and processes and potentially useful as a source of health biomarkers. Taking advantage of high-throughput sequencing technologies, modern bacterial microbiome studies are metagenomic, interrogating thousands of taxa simultaneously. Several data analysis frameworks have been proposed for modeling microbiome sequence read count data and identifying the most significant features. However, there is still room for improvement. We introduce a zero-inflated beta-binomial (ZIBB) model for the distribution of microbiome count data and for testing association with a continuous or categorical phenotype of interest. The approach can exploit mean-variance relationships to improve power and can adjust for covariates. The proposed method is a mixture model with two components: (i) a zero model accounting for excess zeros and (ii) a count model that captures the remaining component by beta-binomial regression, allowing for overdispersion. Simulation studies show that the proposed method effectively controls type I error and has higher power than competing methods to detect taxa associated with phenotype. An R package, ZIBBSeqDiscovery, is available on CRAN.
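The two-component mixture described in the abstract can be illustrated with a minimal sketch of a generic zero-inflated beta-binomial probability mass function. This is not the ZIBBSeqDiscovery implementation. The function names, the (mu, phi) parameterization of the beta-binomial (mean proportion and intra-class overdispersion), and the fixed zero-inflation probability `pi_zero` are assumptions made for illustration. The abstract does not specify the parameterization or how covariates enter the regression.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, mu: float, phi: float) -> float:
    """P(K = k) for a beta-binomial count out of n reads.

    Assumed parameterization (illustrative only): mu in (0, 1) is the mean
    proportion and phi in (0, 1) is the overdispersion, so that
    alpha = mu * (1 - phi) / phi and beta = (1 - mu) * (1 - phi) / phi.
    """
    alpha = mu * (1.0 - phi) / phi
    beta = (1.0 - mu) * (1.0 - phi) / phi
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose
                    + log_beta(k + alpha, n - k + beta)
                    - log_beta(alpha, beta))


def zibb_pmf(k: int, n: int, pi_zero: float, mu: float, phi: float) -> float:
    """P(K = k) under the zero-inflated mixture.

    With probability pi_zero the count is a structural zero (zero model);
    otherwise it is drawn from the beta-binomial (count model).
    """
    bb = beta_binomial_pmf(k, n, mu, phi)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * bb
    return (1.0 - pi_zero) * bb


def zibb_log_likelihood(counts, depths, pi_zero, mu, phi) -> float:
    """Log-likelihood of one taxon's counts across samples.

    counts[i] is the read count in sample i and depths[i] is that sample's
    total read depth; parameters are shared across samples here, which is a
    simplification (no covariates).
    """
    return sum(math.log(zibb_pmf(k, n, pi_zero, mu, phi))
               for k, n in zip(counts, depths))
```

In this sketch a zero count can arise from either component, which is why the `k == 0` branch adds the two contributions. In a regression setting, `mu` and `pi_zero` would typically be linked to the phenotype and covariates per sample, a step omitted here.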
