Department of Medical Statistics, London School of Hygiene and Tropical Medicine, Keppel Street, London, United Kingdom.
Laboratory of Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.
BMC Bioinformatics. 2023 May 22;24(1):210. doi: 10.1186/s12859-023-05219-x.
The microbiome plays a key role in the health of the human body. Interest often lies in finding features of the microbiome, alongside other covariates, that are associated with a phenotype of interest. One important but often overlooked property of microbiome data is its compositionality: the data carry information only about the relative abundances of their constituent components. In high-dimensional datasets, these proportions typically vary by several orders of magnitude. To address these challenges, we develop a Bayesian hierarchical linear log-contrast model, which is estimated by mean-field Monte Carlo co-ordinate ascent variational inference (CAVI-MC) and scales easily to high-dimensional data. We use novel priors that account for the large differences in scale and the constrained parameter space associated with the compositional covariates. Intractable marginal expectations are estimated with a reversible jump Markov chain Monte Carlo sampler that is guided by the data through univariate approximations of the variational posterior probability of inclusion, with proposal parameters informed by approximating variational densities via auxiliary parameters. We demonstrate that the proposed Bayesian method performs favourably against existing state-of-the-art frequentist methods for compositional data analysis. We then apply CAVI-MC to real data to explore the relationship between the gut microbiome and body mass index.
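To make the log-contrast idea concrete, the sketch below fits a plain linear log-contrast regression, y = a + log(X)β + ε with the coefficients constrained to sum to zero, using ordinary least squares on log-ratios against the last component. This is not the Bayesian CAVI-MC procedure described in the abstract: there are no priors, no variable selection and no variational inference. The function name `fit_log_contrast`, the simulated data and all parameter values are assumptions made purely for illustration. The example shows why the zero-sum constraint suits compositional data: rescaling each sample by an arbitrary factor leaves the fitted coefficients unchanged, so only relative abundances matter.

```python
import numpy as np


def fit_log_contrast(X, y):
    """Least-squares fit of y = a + log(X) @ beta + noise with sum(beta) == 0.

    Hypothetical helper for illustration only. The zero-sum constraint is
    imposed by regressing on log-ratios against the last component and
    setting the last coefficient to minus the sum of the others.
    """
    Z = np.log(X)
    n = Z.shape[0]
    # Additive log-ratios: log(X_j / X_p) for j = 1, ..., p - 1.
    A = Z[:, :-1] - Z[:, [-1]]
    design = np.column_stack([np.ones(n), A])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    intercept, gamma = coef[0], coef[1:]
    # Recover the full coefficient vector, which sums to zero by construction.
    beta = np.append(gamma, -gamma.sum())
    return intercept, beta


rng = np.random.default_rng(0)
n, p = 200, 6

# Simulated abundances whose typical sizes differ by orders of magnitude.
counts = rng.lognormal(mean=rng.uniform(0, 6, size=p), sigma=1.0, size=(n, p))
X = counts / counts.sum(axis=1, keepdims=True)  # each row sums to 1

beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0, 0.0])  # sums to zero
y = 2.0 + np.log(X) @ beta_true + rng.normal(scale=0.1, size=n)

intercept, beta_hat = fit_log_contrast(X, y)

# Multiply every sample by its own arbitrary positive factor and refit.
scale = rng.uniform(10, 1000, size=(n, 1))
_, beta_rescaled = fit_log_contrast(X * scale, y)

print("estimated beta:", np.round(beta_hat, 3))
print("sum of beta:", beta_hat.sum())
print("max change after rescaling:", np.abs(beta_hat - beta_rescaled).max())
```

The choice of the last component as the reference is arbitrary: any reference yields the same zero-sum coefficient vector, because the constraint makes the fit depend on the data only through ratios of components.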