Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium.
Center for Microbial Ecology and Technology, Ghent University, Ghent, Belgium.
PLoS One. 2019 Feb 13;14(2):e0205474. doi: 10.1371/journal.pone.0205474. eCollection 2019.
Explorative visualization techniques provide a first summary of microbiome read count datasets through dimension reduction. A plethora of dimension reduction methods exists, but many of them focus primarily on sample ordination, failing to elucidate the role of the bacterial species. Moreover, implicit but often unrealistic assumptions underlying these methods fail to account for overdispersion and differences in sequencing depth, which are two typical characteristics of sequencing data. We combine log-linear models with a dispersion estimation algorithm and flexible response function modelling into a framework for unconstrained and constrained ordination. The method is able to cope with differences in dispersion between taxa and varying sequencing depths, to yield meaningful biological patterns. Moreover, it can correct for observed technical confounders, whereas other methods are adversely affected by these artefacts. Unlike distance-based ordination methods, the assumptions underlying our method are stated explicitly and can be verified using simple diagnostics. The combination of unconstrained and constrained ordination in the same framework is unique in the field and facilitates microbiome data exploration. We illustrate the advantages of our method on simulated and real datasets, while pointing out flaws in existing methods. The algorithms for fitting and plotting are available in the R-package RCM.
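To make the role of sequencing depth and dispersion concrete, the following is a minimal sketch of a log-linear independence model for a sample-by-taxon count table. It is not the algorithm implemented in the RCM package. The function names (`independence_fit`, `pearson_residuals`), the toy counts, and the negative binomial variance form `mu + phi * mu^2` are assumptions made for illustration only. Under independence, the expected count is the sample's sequencing depth multiplied by the taxon's pooled relative abundance, so two samples with the same composition but different depths show no departure from the model.

```python
import math

def independence_fit(counts):
    """Expected counts under the independence model
    log(mu_ij) = log(depth_i) + log(p_j),
    where depth_i is the library size of sample i and p_j is the
    pooled relative abundance of taxon j (illustrative assumption)."""
    depths = [sum(row) for row in counts]
    total = sum(depths)
    rel_abund = [sum(col) / total for col in zip(*counts)]
    return [[d * p for p in rel_abund] for d in depths]

def pearson_residuals(counts, expected, dispersion=None):
    """Residuals (x - mu) / sqrt(mu + phi_j * mu^2).
    phi_j is a taxon-specific dispersion; phi_j = 0 gives the Poisson case."""
    n_taxa = len(counts[0])
    phi = dispersion if dispersion is not None else [0.0] * n_taxa
    return [
        [(x - mu) / math.sqrt(mu + phi[j] * mu * mu)
         for j, (x, mu) in enumerate(zip(row, mu_row))]
        for row, mu_row in zip(counts, expected)
    ]

# Toy data: rows are samples, columns are taxa.
# Samples 0 and 1 share one composition at a tenfold difference in depth;
# sample 2 has a different composition.
counts = [
    [10, 30, 60],
    [100, 300, 600],
    [50, 10, 40],
]

expected = independence_fit(counts)
residuals = pearson_residuals(counts, expected)
overdispersed = pearson_residuals(counts, expected, dispersion=[0.5, 0.5, 0.5])
```

In this sketch the depth enters only through the expected count, so it does not by itself create a departure from independence. Large residuals, such as those of sample 2, are the kind of signal an ordination would summarize in a few dimensions. Allowing a positive dispersion shrinks the residuals, which illustrates why ignoring overdispersion can exaggerate apparent patterns.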