Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY 10016, USA.
Department of Medicine and Microbiology, Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, NJ 08854-8021, USA.
Bioinformatics. 2020 Jan 15;36(2):347-355. doi: 10.1093/bioinformatics/btz565.
Recent microbiome association studies have revealed important associations between the microbiome and disease/health status. Such findings encourage scientists to dive deeper to uncover the causal role of the microbiome in the underlying biological mechanism, and have led to the use of statistical models to quantify causal microbiome effects and to identify the specific microbial agents involved. However, there are no existing causal mediation methods specifically designed to handle high-dimensional and compositional microbiome data.
We propose a rigorous Sparse Microbial Causal Mediation Model (SparseMCMM) specifically designed for high-dimensional and compositional microbiome data in a typical three-factor (treatment, microbiome and outcome) causal study design. In particular, a linear log-contrast regression model and a Dirichlet regression model are proposed to estimate the causal direct effect of treatment and the causal mediation effects of the microbiome at both the community and individual taxon levels. Regularization techniques are used to perform variable selection within the proposed model framework to identify signature causal microbes. Two hypothesis tests on the overall mediation effect are proposed, and their statistical significance is estimated by permutation procedures. Extensive simulated scenarios show that SparseMCMM has excellent performance in estimation and hypothesis testing. Finally, we showcase the utility of SparseMCMM in a study in which the murine microbiome was manipulated, where the method provides a clear and sensible causal path among antibiotic treatment, microbiome composition and mouse weight.
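To make the log-contrast idea concrete, the following is a minimal sketch of an unregularized linear log-contrast regression, in which the outcome is regressed on treatment and on log relative abundances with the taxon coefficients constrained to sum to zero. It is not the SparseMCMM implementation: the function names, the simulated data and the coefficient values are all hypothetical, and the sparsity penalty, the Dirichlet regression and the permutation tests described above are deliberately left out. The sketch also assumes strictly positive abundances (no zeros).

```python
import numpy as np


def fit_log_contrast(y, treatment, composition):
    """Least-squares fit of y ~ a0 + aT*T + sum_j a_j*log(M_j), with sum_j a_j = 0.

    Hypothetical helper for illustration only. `composition` is an (n, p)
    array of strictly positive relative abundances.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(treatment, dtype=float)
    log_m = np.log(np.asarray(composition, dtype=float))
    # Centered log-ratio transform: each row of `clr` sums to zero.
    clr = log_m - log_m.mean(axis=1, keepdims=True)
    design = np.column_stack([np.ones_like(t), t, clr])
    # The design is rank deficient by one. lstsq returns the minimum-norm
    # solution, whose taxon coefficients sum to zero, so the log-contrast
    # constraint is satisfied without an explicit constrained solver.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"intercept": coef[0], "treatment": coef[1], "taxa": coef[2:]}


def simulate_example(n=200, seed=0):
    """Noise-free toy data with made-up coefficients (assumption, not study data)."""
    rng = np.random.default_rng(seed)
    t = rng.integers(0, 2, size=n).astype(float)      # binary treatment
    m = rng.dirichlet(np.full(5, 2.0), size=n)        # 5-taxon compositions
    taxa = np.array([1.0, -1.0, 0.5, -0.5, 0.0])      # sums to zero
    y = 0.3 + 1.2 * t + np.log(m) @ taxa
    return y, t, m, taxa


if __name__ == "__main__":
    y, t, m, taxa = simulate_example()
    fit = fit_log_contrast(y, t, m)
    print("treatment coefficient:", fit["treatment"])
    print("taxon coefficients:", fit["taxa"])
    print("sum of taxon coefficients:", fit["taxa"].sum())
```

The zero-sum constraint is what makes the fit invariant to the arbitrary total of a composition: multiplying every abundance in a sample by the same constant leaves the fitted values unchanged. A sparse version would add a penalty on the taxon coefficients while keeping that constraint.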
https://sites.google.com/site/huilinli09/software and https://github.com/chanw0/SparseMCMM.
Supplementary data are available at Bioinformatics online.