Dept. of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 75 Mikras Asias Str, 115 27 Athens, Greece.
Danish Cancer Society Research Centre, Copenhagen, Denmark.
Environ Int. 2021 Feb;147:106371. doi: 10.1016/j.envint.2020.106371. Epub 2021 Jan 12.
We evaluated methods for the analysis of multi-level survival data using a pooled dataset of 14 cohorts participating in the ELAPSE project, which investigates associations between residential exposure to low levels of air pollution (PM2.5 and NO2) and health (natural-cause mortality and cerebrovascular, coronary and lung cancer incidence).
We applied five approaches in a multivariable Cox model to handle the first level of clustering, defined by cohort: (1) no adjustment for cohort, (2) indicator variables for cohort, (3) strata by cohort, (4) a frailty term per cohort in a frailty Cox model, and (5) a random intercept per cohort in a mixed Cox model. We accounted for the second level of clustering, due to common characteristics within the residential area, by either (1) a random intercept per small area or (2) a variance correction. We assessed the stratified, frailty and mixed Cox approaches through simulations under different scenarios for heterogeneity in the underlying hazards and in the air pollution effects.
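The contrast between approach (1), no cohort adjustment, and approach (3), strata, can be sketched on simulated data. This is a minimal illustration and not the ELAPSE analysis code: the function names `fit_cox` and `simulate_cohorts`, the three hypothetical cohorts, their baseline hazards and exposure means, and the simulated log hazard ratio of 0.5 are all assumptions made for the example. It assumes a single covariate and no tied event times, and fits the Cox partial likelihood by Newton-Raphson with risk sets restricted to the subject's own stratum.

```python
import math
import random
from collections import defaultdict


def fit_cox(times, events, x, strata, max_iter=50, tol=1e-8):
    """Fit a one-covariate Cox model by Newton-Raphson.

    Risk sets are formed within each stratum, so every stratum keeps its
    own unspecified baseline hazard. Returns (beta, standard_error).
    """
    by_stratum = defaultdict(list)
    for t, d, xi, s in zip(times, events, x, strata):
        by_stratum[s].append((t, d, xi))

    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for rows in by_stratum.values():
            for t_i, d_i, x_i in rows:
                if not d_i:
                    continue  # censored subjects contribute only to risk sets
                s0 = s1 = s2 = 0.0
                for t_j, _d_j, x_j in rows:
                    if t_j >= t_i:  # still at risk at t_i, same stratum
                        w = math.exp(beta * x_j)
                        s0 += w
                        s1 += w * x_j
                        s2 += w * x_j * x_j
                mean_x = s1 / s0
                score += x_i - mean_x          # score contribution
                info += s2 / s0 - mean_x ** 2  # observed information
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # info was evaluated just before the final (negligible) step
    return beta, 1.0 / math.sqrt(info)


def simulate_cohorts(seed=1, n_per_cohort=300, true_beta=0.5):
    """Simulate hypothetical cohorts whose baseline hazard and mean
    exposure both differ, so that cohort confounds the pooled analysis."""
    rng = random.Random(seed)
    times, events, x, cohort = [], [], [], []
    # (baseline hazard, mean exposure) per cohort: illustrative values only
    for c, (h0, mean_x) in enumerate([(0.5, 0.0), (1.0, 1.0), (2.0, 2.0)]):
        for _ in range(n_per_cohort):
            xi = rng.gauss(mean_x, 1.0)
            t = rng.expovariate(h0 * math.exp(true_beta * xi))
            times.append(min(t, 2.0))  # administrative censoring at t = 2
            events.append(t <= 2.0)
            x.append(xi)
            cohort.append(c)
    return times, events, x, cohort


times, events, x, cohort = simulate_cohorts()
beta_strat, se_strat = fit_cox(times, events, x, cohort)
beta_pooled, se_pooled = fit_cox(times, events, x, [0] * len(x))
print(f"stratified by cohort: {beta_strat:.3f} (SE {se_strat:.3f})")
print(f"no cohort adjustment: {beta_pooled:.3f} (SE {se_pooled:.3f})")
```

Passing a single constant stratum reproduces the unadjusted model, which mixes within-cohort and between-cohort exposure contrasts. Under this simulated setup the stratified fit uses only within-cohort contrasts and would be expected to land close to the simulated value of 0.5. The frailty, mixed-model and variance-correction approaches are not covered by this sketch.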
Effect estimates were stable across the approaches that adjusted for cohort but differed substantially when no adjustment was applied. Further adjustment for the small area grouping increased the standard errors of the effect estimates. Simulations confirmed identical results between the stratified and frailty models. In ELAPSE we selected a stratified multivariable Cox model to account for between-cohort heterogeneity, without adjustment at the small area level because of the small number of subjects and events per area.
Our study supports the need to account for between-cohort heterogeneity in multi-center collaborations using pooled individual level data.