MRC Biostatistics Unit, Institute of Public Health, Cambridge, UK.
Int J Epidemiol. 2010 Oct;39(5):1345-59. doi: 10.1093/ije/dyq063. Epub 2010 May 3.
Meta-analysis of individual participant time-to-event data from multiple prospective epidemiological studies enables detailed investigation of exposure-risk relationships, but involves a number of analytical challenges.
This article describes statistical approaches adopted in the Emerging Risk Factors Collaboration, in which primary data from more than 1 million participants in more than 100 prospective studies have been collated to enable detailed analyses of various risk markers in relation to incident cardiovascular disease outcomes.
Analyses have been principally based on Cox proportional hazards regression models stratified by sex, undertaken in each study separately. Estimates of exposure-risk relationships, initially unadjusted and then adjusted for several confounders, have been combined over studies using meta-analysis. Methods for assessing the shape of exposure-risk associations and the proportional hazards assumption have been developed. Estimates of interactions have also been combined using meta-analysis, keeping separate within- and between-study information. Regression dilution bias caused by measurement error and within-person variation in exposures and confounders has been addressed through the analysis of repeat measurements to estimate corrected regression coefficients. These methods are exemplified by analysis of plasma fibrinogen and risk of coronary heart disease, and Stata code is made available.
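As a rough illustration of the second stage of the approach described above, the sketch below pools study-specific log hazard ratios by inverse-variance weighting and then applies a simple regression dilution correction. It is not the Stata code that accompanies the article. The abstract does not say whether fixed-effect or random-effects pooling was used, so both are shown, with the DerSimonian-Laird estimator assumed for the random-effects case. The function names, the example estimates and the regression dilution ratio are all hypothetical, and the correction shown is the basic univariate form (dividing the observed coefficient by the regression dilution ratio estimated from repeat measurements), which does not handle measurement error in confounders.

```python
import math


def pool_fixed(estimates, std_errors):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def pool_random(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling (an assumed choice of estimator).

    Returns the pooled estimate, its standard error, and the
    between-study variance tau^2.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    fixed, _ = pool_fixed(estimates, std_errors)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, estimates))
    k = len(estimates)
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total), tau2


def correct_for_dilution(log_hr, rdr):
    """Divide an observed log hazard ratio by a regression dilution ratio.

    rdr is assumed to be the slope from regressing repeat exposure
    measurements on baseline measurements (between 0 and 1 in practice).
    """
    if rdr <= 0:
        raise ValueError("regression dilution ratio must be positive")
    return log_hr / rdr


# Hypothetical study-specific log hazard ratios and standard errors,
# e.g. one per study from a sex-stratified Cox model.
log_hrs = [0.35, 0.50, 0.28, 0.42]
ses = [0.10, 0.15, 0.08, 0.20]

pooled, se, tau2 = pool_random(log_hrs, ses)
lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
print("pooled HR:", math.exp(pooled))
print("95% CI:", math.exp(lower), math.exp(upper))
print("tau^2:", tau2)

# Hypothetical regression dilution ratio of 0.6.
print("corrected HR:", math.exp(correct_for_dilution(pooled, 0.6)))
```

Pooling on the log hazard ratio scale and exponentiating only at the end keeps the weights and confidence limits symmetric on the scale where the Cox model estimates are approximately normal.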
Increasing numbers of meta-analyses of individual participant data from observational studies are being conducted to enhance the statistical power and detail of epidemiological studies. The statistical methods developed here can be used to address the needs of such analyses.