Zhao Ni, Zhan Xiang, Guthrie Katherine A, Mitchell Caroline M, Larson Joseph
Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, United States of America.
Department of Public Health Sciences, Pennsylvania State University, Hershey, Pennsylvania, United States of America.
Genet Epidemiol. 2018 Jul;42(5):459-469. doi: 10.1002/gepi.22127. Epub 2018 May 7.
The human microbiome is a dynamic system that changes due to diseases, medication, changes in diet, etc. The paired design is a common approach to evaluate microbial changes while controlling for the inherent differences between people. For example, microbiome data may be collected from the same individuals before and after a treatment. Two challenges exist in analyzing this type of data. First, microbiome data are compositional, such that the reads for all taxa in each sample are constrained to sum to a constant. Second, the number of taxa can be much larger than the sample size. Few statistical methods exist to analyze such data besides methods that test one taxon at a time. In this paper, we propose to first conduct a log-ratio transformation of the compositions, and then develop a generalized Hotelling's test (GHT) to evaluate whether the average microbiome compositions are equivalent in the paired samples. We replace the sample covariance matrix in the standard Hotelling's statistic by a shrinkage-based covariance, calculated as a weighted average of the sample covariance and a positive definite target matrix. The optimal weighting can be obtained for many commonly used target matrices. We develop a permutation procedure to assess the statistical significance. Extensive simulations show that our proposed method has well-controlled type I error and better power than a few ad hoc approaches. We apply our method to examine the vaginal microbiome changes in response to treatments for menopausal hot flashes. An R package "GHT" is freely available at https://github.com/zhaoni153/GHT.
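The pipeline described in the abstract (log-ratio transformation, shrinkage-based covariance, Hotelling-type statistic, permutation p-value) can be illustrated with a minimal Python sketch. It is not the GHT R package and does not reproduce its interface; all function names are hypothetical. Several choices are assumptions made only for illustration: the centered log-ratio is used as the log-ratio transformation, a pseudocount of 0.5 handles zero counts, the target matrix is the diagonal of the sample covariance, the shrinkage weight is supplied by the user rather than estimated optimally, and the permutation step flips the sign of each subject's paired difference (equivalent to swapping the before and after labels within a pair).

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of an (n_samples, n_taxa) count matrix.

    The pseudocount is an assumption used to avoid log(0).
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)


def shrinkage_hotelling_stat(diffs, weight):
    """Hotelling-type statistic with a shrinkage covariance.

    The covariance is weight * target + (1 - weight) * sample covariance,
    where the target is the diagonal of the sample covariance (an assumed
    choice of positive definite target).
    """
    n = diffs.shape[0]
    mean_diff = diffs.mean(axis=0)
    sample_cov = np.cov(diffs, rowvar=False)
    target = np.diag(np.diag(sample_cov))
    shrunk_cov = weight * target + (1.0 - weight) * sample_cov
    # Solve a linear system instead of forming the inverse explicitly.
    return n * mean_diff @ np.linalg.solve(shrunk_cov, mean_diff)


def paired_permutation_pvalue(before, after, weight=0.5, n_perm=999, seed=0):
    """Sign-flip permutation p-value for paired compositional samples."""
    rng = np.random.default_rng(seed)
    diffs = clr(after) - clr(before)  # one difference vector per subject
    observed = shrinkage_hotelling_stat(diffs, weight)
    exceed = 0
    for _ in range(n_perm):
        # Flipping a row's sign swaps that subject's before/after labels.
        signs = rng.choice([-1.0, 1.0], size=(diffs.shape[0], 1))
        if shrinkage_hotelling_stat(diffs * signs, weight) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated counts only: 15 subjects, 40 taxa (more taxa than subjects).
    rng = np.random.default_rng(1)
    before = rng.poisson(50, size=(15, 40))
    after = rng.poisson(50, size=(15, 40))
    stat, pval = paired_permutation_pvalue(before, after)
    print(f"statistic = {stat:.3f}, permutation p-value = {pval:.3f}")
```

Because centered log-ratio vectors sum to zero, the sample covariance of the differences is singular even when the sample size exceeds the number of taxa; shrinking toward a positive definite target is what makes the linear solve well defined in this sketch.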