Bind M-A, VanderWeele T J, Schwartz J D, Coull B A
Department of Statistics, Harvard University, Cambridge, MA, U.S.A.
Epidemiology, Harvard School of Public Health, Boston, MA, U.S.A.
Stat Med. 2017 Nov 20;36(26):4182-4195. doi: 10.1002/sim.7423. Epub 2017 Aug 7.
Mediation analysis has mostly been conducted with mean regression models. Under this approach, formulae for direct and indirect effects are based on changes in means, which may not capture effects that occur in units at the tails of the mediator and outcome distributions. Individuals with extreme values of medical endpoints are often more susceptible to disease and can be missed if one investigates mean changes only. We derive the controlled direct and indirect effects of an exposure along percentiles of the mediator and outcome using quantile regression models and a causal framework. The quantile regression models can accommodate an exposure-mediator interaction and random intercepts to allow for a longitudinal mediator and outcome. Because DNA methylation acts as a complex "switch" to control gene expression and fibrinogen is a cardiovascular factor, individuals with extreme levels of these markers may be more susceptible to air pollution. We therefore apply this methodology to environmental data to estimate the effect of air pollution, as measured by particle number, on fibrinogen levels through a change in interferon-gamma (IFN-γ) methylation. We estimate the controlled direct effect of air pollution on the qth percentile of fibrinogen and its indirect effect through a change in the pth percentile of IFN-γ methylation. We found evidence of a direct effect of particle number on the upper tail of the fibrinogen distribution. We also observed a suggestive indirect effect of particle number on the upper tail of the fibrinogen distribution through a change in the lower percentiles of the IFN-γ methylation distribution.
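To make the quantities in the abstract concrete, the following is a minimal sketch of how percentile-specific effects can be written down. It is built by analogy with the standard mean-model mediation formulas that allow an exposure-mediator interaction and is not necessarily the paper's exact derivation. The notation is assumed for illustration: A is the exposure (particle number), M the mediator (IFN-γ methylation), Y the outcome (fibrinogen), C a vector of covariates, and a, a* are two exposure levels being compared. Random intercepts are omitted for brevity, and plugging a mediator percentile into the outcome model is a simplifying assumption of this sketch.

```latex
% Assumed linear quantile models (illustrative notation, random intercepts omitted)
\begin{align*}
Q_M(p \mid a, c)    &= \beta_0(p) + \beta_1(p)\,a + \beta_2(p)^{\top} c, \\
Q_Y(q \mid a, m, c) &= \theta_0(q) + \theta_1(q)\,a + \theta_2(q)\,m
                       + \theta_3(q)\,a\,m + \theta_4(q)^{\top} c.
\end{align*}

% Controlled direct effect on the q-th percentile of Y, mediator fixed at m
\begin{align*}
\mathrm{CDE}_q(m)
  &= Q_Y(q \mid a, m, c) - Q_Y(q \mid a^{*}, m, c) \\
  &= \bigl(\theta_1(q) + \theta_3(q)\,m\bigr)\,(a - a^{*}).
\end{align*}

% Indirect effect on the q-th percentile of Y through the p-th percentile of M:
% exposure held at a in the outcome model, mediator percentile shifted from
% Q_M(p | a*, c) to Q_M(p | a, c), a shift of beta_1(p) (a - a*)
\begin{align*}
\mathrm{IE}_{q,p}
  &= Q_Y\bigl(q \mid a, Q_M(p \mid a, c), c\bigr)
   - Q_Y\bigl(q \mid a, Q_M(p \mid a^{*}, c), c\bigr) \\
  &= \bigl(\theta_2(q) + \theta_3(q)\,a\bigr)\,\beta_1(p)\,(a - a^{*}).
\end{align*}
```

Read this way, a direct effect "on the upper tail" corresponds to a nonzero CDE at large q, and an indirect effect "through the lower percentiles" of the mediator corresponds to a nonzero IE that pairs large q with small p.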