Touloumi G, Samoli E, Pipikou M, Le Tertre A, Atkinson R, Katsouyanni K
Department of Hygiene and Epidemiology, Athens Medical School, Athens, Greece.
Stat Med. 2006 Dec 30;25(24):4164-78. doi: 10.1002/sim.2681.
A major statistical challenge in air pollution and health time-series studies is to adequately control for confounding effects of time-varying covariates. Daily health outcome counts are most commonly analysed by Poisson regression models, adjusted for overdispersion, with air pollution levels included as a linear predictor and smooth functions for calendar time and weather variables to adjust for time-varying confounders. Various smoothers have been used, but the optimal strategy for choosing smoothers and their degree of smoothing remains controversial. In this work, we use a simulation study to evaluate the performance of various smoothers, combined with different criteria for choosing the degree of smoothing, in terms of bias and efficiency of the air pollution effect estimate. The evaluated approaches were also applied to real mortality data from 22 European cities. The simulation study imitated a multi-city study. Data were generated from a fully parametric model. Model selection methods which optimize prediction may lead to increased biases in the air pollution effect estimate. Minimization of the absolute value of the sum of the partial autocorrelation function (PACF) of the model's residuals, as a criterion to choose the degree of smoothness, gave the smallest biases. The penalized splines (PS) method with a large number of effective degrees of freedom (e.g. 8-12 per year) could be used as the basic, relatively conservative, analysis, whereas the PS and natural splines in combination with PACF could be applied to provide a reasonable range of the effect estimate.
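The PACF criterion described in the abstract can be illustrated with a minimal sketch. It assumes that the Poisson models have already been fitted elsewhere, once per candidate number of degrees of freedom, and that the residual series from each fit is available. The function names (`pacf`, `pacf_criterion`, `choose_df`), the Durbin-Levinson estimator, and the default of 30 lags are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List, Sequence


def autocorrelations(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_0..r_max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("residual series has zero variance")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(x: Sequence[float], max_lag: int) -> List[float]:
    """Partial autocorrelations at lags 1..max_lag (Durbin-Levinson recursion)."""
    r = autocorrelations(x, max_lag)
    out: List[float] = []
    phi_prev: List[float] = []  # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den  # partial autocorrelation at lag k
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        out.append(phi_kk)
        phi_prev = phi
    return out


def pacf_criterion(residuals: Sequence[float], max_lag: int = 30) -> float:
    """Absolute value of the sum of the residual PACF (max_lag is an assumption)."""
    return abs(sum(pacf(residuals, max_lag)))


def choose_df(residuals_by_df: Dict[int, Sequence[float]], max_lag: int = 30) -> int:
    """Return the candidate df whose residuals minimize the PACF criterion."""
    return min(
        residuals_by_df,
        key=lambda df: pacf_criterion(residuals_by_df[df], max_lag),
    )
```

In use, `residuals_by_df` would map each candidate number of degrees of freedom for the time smoother to the residuals of the corresponding fitted model, and `choose_df` would return the candidate with the least residual partial autocorrelation under this criterion.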