Decker Anna L, Hubbard Alan, Crespi Catherine M, Seto Edmund Y W, Wang May C
University of California - Berkeley, Berkeley, CA 94704, USA.
Division of Biostatistics, University of California - Berkeley, Berkeley, CA, USA.
J Causal Inference. 2014 Mar;2(1):95-108. doi: 10.1515/jci-2013-0025.
Child and adolescent obesity is a serious public health concern, yet few studies have used parameters from the causal inference literature to examine the potential impact of early intervention. The purpose of this analysis was to estimate the causal effects of early interventions to improve physical activity and diet during adolescence on body mass index (BMI), a measure of adiposity, using estimation techniques that account for time-dependent confounding.

The most widespread statistical method in studies of child and adolescent obesity is multivariable regression, with the parameter of interest being the coefficient on the exposure variable. This approach does not appropriately adjust for time-dependent confounding, and its modeling assumptions may not always be met. An alternative parameter, motivated by the causal inference literature, can be interpreted as the mean change in the outcome under interventions that set the exposure of interest. The underlying data-generating distribution, on which the estimator is based, can be estimated via a parametric or semiparametric approach.

Using data from the National Heart, Lung, and Blood Institute Growth and Health Study, a 10-year prospective cohort study of adolescent girls, we estimated the longitudinal impact of physical activity and diet interventions on 10-year BMI z-scores via such a parameter, using both parametric and semiparametric estimation approaches. The parameters of interest were estimated with a recently released R package, ltmle, which estimates means under general longitudinal treatment regimes.

We found that early, sustained intervention on total calories had a greater impact than a physical activity intervention or non-sustained interventions. Multivariable linear regression yielded inflated effect estimates compared with estimates based on targeted maximum-likelihood estimation and data-adaptive super learning.
Our analysis demonstrates that semiparametric estimation of longitudinal treatment-specific means via ltmle provides a powerful yet easy-to-use tool, removing impediments to putting theory into practice.
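The difference between a regression coefficient and a mean outcome under a set exposure regime can be illustrated with a toy simulation. The following is a minimal sketch in Python with NumPy, not the authors' analysis: it does not use ltmle, targeted maximum-likelihood estimation, or super learning, and it swaps in a plain parametric g-computation (iterated regression) estimator. The two-visit design, the variable names, and every coefficient are hypothetical, and the direction of the regression bias is a property of this toy setup, not a reproduction of the study's findings.

```python
import numpy as np


def simulate(n, seed=0):
    """Hypothetical two-visit cohort: exposure a0, covariate l1, exposure a1, outcome y."""
    rng = np.random.default_rng(seed)
    a0 = rng.binomial(1, 0.5, n)                   # exposure at the first visit
    l1 = -0.5 * a0 + rng.normal(0.0, 1.0, n)       # covariate affected by a0
    p1 = 1.0 / (1.0 + np.exp(-0.8 * l1))           # l1 influences later exposure
    a1 = rng.binomial(1, p1)                       # exposure at the second visit
    y = -0.2 * a0 - 0.3 * a1 + 0.5 * l1 + rng.normal(0.0, 1.0, n)
    return a0, l1, a1, y


def ols(columns, target):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(target)), *columns])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta


def regression_effect(a0, l1, a1, y):
    """Sum of the exposure coefficients from one regression adjusted for l1."""
    beta = ols([a0, a1, l1], y)
    return beta[1] + beta[2]


def gcomp_mean(a0, l1, a1, y, set_a0, set_a1):
    """Mean outcome if everyone had exposures (set_a0, set_a1), by iterated regression."""
    # Step 1: model the outcome given the full history, then set a1.
    b = ols([a0, a1, l1], y)
    q1 = b[0] + b[1] * a0 + b[2] * set_a1 + b[3] * l1
    # Step 2: regress that prediction on the earlier exposure, then set a0.
    g = ols([a0], q1)
    return g[0] + g[1] * set_a0


# In this simulation the sustained-exposure effect is
# -0.2 + -0.3 + 0.5 * (-0.5) = -0.75, because part of a0's effect runs through l1.
TRUE_EFFECT = -0.75

a0, l1, a1, y = simulate(100_000, seed=1)
gcomp_effect = gcomp_mean(a0, l1, a1, y, 1, 1) - gcomp_mean(a0, l1, a1, y, 0, 0)
print("regression coefficients:", round(regression_effect(a0, l1, a1, y), 3))
print("g-computation contrast: ", round(gcomp_effect, 3))
print("true contrast:          ", TRUE_EFFECT)
```

In this setup the covariate `l1` is both a confounder of the later exposure and a consequence of the earlier one, so adjusting for it in a single regression removes the part of the early exposure's effect that passes through it. The iterated-regression estimator targets the mean under the whole regime and recovers the full contrast, provided both regression models are correctly specified.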