Division of General Medical Sciences, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA.
PLoS One. 2012;7(12):e52078. doi: 10.1371/journal.pone.0052078. Epub 2012 Dec 20.
This paper presents new biostatistical methods for analyzing microbiome data, based on a fully parametric approach that uses all the data. The Dirichlet-multinomial distribution allows the analyst to calculate power and sample sizes for experimental design, test hypotheses (e.g., compare microbiomes across groups), and estimate parameters describing microbiome properties. Compared with non-parametric alternatives such as bootstrapping and permutation testing, a fully parametric model retains more of the information contained in the data. This paper details the statistical approaches for several hypothesis tests and power/sample size calculations and, for illustration, applies them to taxonomic abundance distribution and rank abundance distribution data, using HMP Jumpstart data on 24 subjects covering saliva, subgingival, and supragingival samples. Software for running these analyses is available.
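To make the role of the Dirichlet-multinomial distribution concrete, the following is a minimal Python sketch of its log-probability for a single vector of taxa counts, and of the log-likelihood over several samples. It is not the software referred to in the abstract. The function names are hypothetical, and the conversion from a mean proportion vector `pi` and an overdispersion value `theta` to concentration parameters, `alpha_j = pi_j * (1 - theta) / theta`, is one common parametrization that is assumed here rather than taken from the text.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of one count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per taxon.
    alpha:  positive concentration parameters, one per taxon.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a = sum(alpha)
    # Terms that depend only on the totals: n! * Gamma(a) / Gamma(n + a)
    log_p = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    # Per-taxon terms: Gamma(x + alpha) / (Gamma(alpha) * x!)
    for x, al in zip(counts, alpha):
        log_p += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return log_p


def alpha_from_pi_theta(pi, theta):
    """Assumed parametrization: alpha_j = pi_j * (1 - theta) / theta."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    scale = (1.0 - theta) / theta
    return [p * scale for p in pi]


def dm_log_likelihood(samples, alpha):
    """Sum of log-probabilities over independent samples (count vectors)."""
    return sum(dm_log_pmf(x, alpha) for x in samples)


if __name__ == "__main__":
    # Made-up toy counts for three taxa in two samples (not HMP data).
    toy_samples = [[12, 5, 3], [9, 8, 3]]
    alpha = alpha_from_pi_theta([0.5, 0.3, 0.2], theta=0.1)
    print(dm_log_likelihood(toy_samples, alpha))
```

A quick sanity check on the sketch: with two taxa and `alpha = [1.0, 1.0]`, the distribution is uniform over the possible splits of the total count, so `math.exp(dm_log_pmf([1, 2], [1.0, 1.0]))` equals 1/4 up to floating-point error. Maximizing `dm_log_likelihood` over the parameters would give parameter estimates of the kind the abstract mentions, although the estimation procedure used by the authors is not specified here.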