Greven Sonja, Dominici Francesca, Zeger Scott
Emmy Noether Junior Research Group Leader, Department of Statistics, Ludwig-Maximilians-Universität München, 80539 Munich, Germany.
Professor, Department of Biostatistics, Harvard University, Boston, MA 02115.
J Am Stat Assoc. 2011;106(494):396-406. doi: 10.1198/jasa.2011.ap09392. Epub 2012 Jan 24.
There is substantial observational evidence that long-term exposure to particulate air pollution is associated with premature death in urban populations. Estimates of the magnitude of these effects derive largely from cross-sectional comparisons of adjusted mortality rates among cities with varying pollution levels. Such estimates are potentially confounded by other differences among the populations correlated with air pollution, for example, socioeconomic factors. An alternative approach is to study covariation of particulate matter and mortality across time within a city, as has been done in investigations of short-term exposures. In either event, observational studies like these are subject to confounding by unmeasured variables. Therefore the ability to detect such confounding and to derive estimates less affected by confounding is a high priority. In this article, we describe and apply a method of decomposing the exposure variable into components with variation at distinct temporal, spatial, and time-by-space scales, here focusing on the components involving time. Starting from a proportional hazards model, we derive a Poisson regression model and estimate two regression coefficients: the "global" coefficient that measures the association between national trends in pollution and mortality; and the "local" coefficient, derived from space-by-time variation, that measures the association between location-specific trends in pollution and mortality adjusted for the national trends. Absent unmeasured confounders and given valid model assumptions, the scale-specific coefficients should be similar; substantial differences in these coefficients constitute a basis for questioning the model. We derive a backfitting algorithm to fit our model to very large spatio-temporal datasets.
We apply our methods to the Medicare Cohort Air Pollution Study (MCAPS), which includes individual-level information on time of death and age on a population of 18.2 million for the period 2000-2006. Results based on the global coefficient indicate a large increase in the national life expectancy for reductions in the yearly national average of PM. However, this coefficient based on national trends in PM and mortality is likely to be confounded by other variables trending on the national level. Confounding of the local coefficient by unmeasured factors is less likely, although it cannot be ruled out. Based on the local coefficient alone, we are not able to demonstrate any change in life expectancy for a reduction in PM. We use additional survey data available for a subset of the data to investigate sensitivity of results to the inclusion of additional covariates, but both coefficients remain largely unchanged.
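The scale decomposition described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a simple double-centering decomposition of the exposure into spatial, global (national trend), and local (location-specific deviation from the national trend) components, fits an ordinary Poisson regression by Newton-Raphson rather than by backfitting, and uses entirely synthetic data. The function names (`decompose_exposure`, `fit_poisson`, `demo`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def decompose_exposure(x):
    """Split x[c, t] (locations x years) into additive components.

    Assumed decomposition: x = overall + spatial + global_t + local_t.
    """
    overall = x.mean()
    loc_mean = x.mean(axis=1, keepdims=True)   # long-term average per location
    time_mean = x.mean(axis=0, keepdims=True)  # national average per year
    spatial = np.broadcast_to(loc_mean - overall, x.shape)
    global_t = np.broadcast_to(time_mean - overall, x.shape)
    local_t = x - loc_mean - time_mean + overall  # double-centered residual
    return overall, spatial, global_t, local_t


def fit_poisson(X, y, offset, n_iter=100, tol=1e-10):
    """Newton-Raphson for log E[y] = offset + X @ beta (column 0 = intercept)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.sum() / np.exp(offset).sum())  # start at the crude rate
    for _ in range(n_iter):
        mu = np.exp(offset + X @ beta)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def demo(seed=0, n_loc=200, n_year=7, beta_true=0.01, gamma=-0.02):
    """Simulate mortality with an unmeasured national trend and fit both scales."""
    rng = np.random.default_rng(seed)
    years = np.arange(n_year)
    # Synthetic exposure: location level + declining national trend + local noise.
    x = (12.0 + rng.normal(0, 2, (n_loc, 1))
         - 0.3 * years + rng.normal(0, 1, (n_loc, n_year)))
    # Hypothetical unmeasured confounder: mortality also falls nationally over time.
    log_rate = np.log(0.05) + beta_true * x + gamma * years
    pop = np.full(x.shape, 50_000.0)
    deaths = rng.poisson(pop * np.exp(log_rate))

    _, spatial, global_t, local_t = decompose_exposure(x)
    X = np.column_stack([np.ones(x.size), spatial.ravel(),
                         global_t.ravel(), local_t.ravel()])
    beta = fit_poisson(X, deaths.ravel().astype(float), np.log(pop).ravel())
    return {"spatial": beta[1], "global": beta[2], "local": beta[3]}


if __name__ == "__main__":
    print(demo())
```

In this synthetic setting the unmeasured national trend is absorbed by the global coefficient, which ends up well above the true exposure effect, while the local coefficient stays close to it. This mirrors the abstract's argument that a large gap between the scale-specific coefficients is a reason to question the model.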