Zhang Saijuan, Midthune Douglas, Guenther Patricia M, Krebs-Smith Susan M, Kipnis Victor, Dodd Kevin W, Buckman Dennis W, Tooze Janet A, Freedman Laurence, Carroll Raymond J
Department of Statistics, Texas A&M University, 3143 TAMU, College Station, Texas 77843-3143, U.S.A.
Ann Appl Stat. 2011 Jun 1;5(2B):1456-1487. doi: 10.1214/10-AOAS446.
In the United States the preferred method of obtaining dietary intake data is the 24-hour dietary recall, yet the measure of most interest is usual or long-term average daily intake, which is impossible to measure. Thus, usual dietary intake is assessed with considerable measurement error. Also, diet represents numerous foods, nutrients and other components, each of which has distinctive attributes. Sometimes, it is useful to examine intake of these components separately, but increasingly nutritionists are interested in exploring them collectively to capture overall dietary patterns. Consumption of these components varies widely: some are consumed by almost everyone on every day, while others are episodically consumed so that 24-hour recall data are zero-inflated. In addition, they are often correlated with each other. Finally, it is often preferable to analyze the amount of a dietary component relative to the amount of energy (calories) in a diet because dietary recommendations often vary with energy level. The quest to understand overall dietary patterns of usual intake has to this point reached a standstill. There are no statistical methods or models available to model such complex multivariate data with its measurement error and zero inflation. This paper proposes the first such model, and it proposes the first workable solution to fit such a model. After describing the model, we use survey-weighted MCMC computations to fit the model, with uncertainty estimation coming from balanced repeated replication. The methodology is illustrated through an application to estimating the population distribution of the Healthy Eating Index-2005 (HEI-2005), a multi-component dietary quality index involving ratios of interrelated dietary components to energy, among children aged 2-8 in the United States. We pose a number of interesting questions about the HEI-2005 and provide answers that were not previously within the realm of possibility, and we indicate ways that our approach can be used to answer other questions of importance to nutritional science and public health.