Hao Xiaojuan, Eskridge Kent M, Wang Dong
Department of Statistics, University of Nebraska, Lincoln, NE, USA.
Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR, USA.
J Appl Stat. 2020 Nov 30;49(5):1140-1153. doi: 10.1080/02664763.2020.1854200. eCollection 2022.
With advances in next-generation sequencing technologies, researchers now routinely obtain collections of microbial sequences with complex phylogenetic relationships. It is often of interest to analyze the association between certain environmental factors and characteristics of the microbial collection. Though methods have been developed to test for association between microbial composition and environmental factors, as well as between coevolving traits, a flexible model that provides a comprehensive picture of the relationship between microbial community characteristics and environmental variables would be highly beneficial. We developed a Bayesian approach for association analysis that incorporates the phylogenetic structure to account for the dependence between observations. To overcome the computational difficulty related to the phylogenetic tree, a variational algorithm was developed to evaluate the posterior distribution. Because the posterior distribution can be readily obtained for the parameters of interest and for any derived variables, the association can be examined comprehensively. With two application examples, we demonstrated that the Bayesian approach can uncover nuanced details of the microbial assemblage with regard to the environmental factor. The proposed Bayesian approach and variational algorithm can be extended to other problems involving dependence over tree-like structures.