Carroll Raymond J, Midthune Douglas, Freedman Laurence S, Kipnis Victor
Department of Statistics, Texas A&M University, TAMU 3143, College Station, Texas 77843-3143, USA.
Biometrics. 2006 Mar;62(1):75-84. doi: 10.1111/j.1541-0420.2005.00400.x.
Motivated by an important biomarker study in nutritional epidemiology, we consider the combination of the linear mixed measurement error model and the linear seemingly unrelated regression model, hence the name Seemingly Unrelated Measurement Error Models. In our context, we have data on protein intake and energy (caloric) intake from both a food frequency questionnaire (FFQ) and a biomarker, and wish to understand the measurement error properties of the FFQ for each nutrient. Our idea is to develop separate marginal mixed measurement error models for each nutrient, and then combine them into a larger multivariate measurement error model: the two measurement error models are seemingly unrelated because they concern different nutrients, but aspects of each model are highly correlated. As in any seemingly unrelated regression context, the hope is to achieve gains in statistical efficiency compared to fitting each model separately. We show that if we employ a "full" (fully parameterized) model, the combination of the two measurement error models leads to no gain over considering each model separately. However, there is also a scientifically motivated "reduced" model that sets certain parameters in the "full" model equal to zero, and for which the combination of the two measurement error models leads to considerable gain over considering each model separately, e.g., a 40% decrease in standard errors. We use the Akaike information criterion to distinguish between the two possibilities, and show that the resulting estimates achieve major gains in efficiency. We also describe theoretical and serious practical problems with the Bayes information criterion in this context.
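The choice between a "full" model and a "reduced" model that fixes some parameters at zero can be illustrated with a toy sketch. The example below is not the authors' measurement error model: it assumes a plain bivariate normal sample, treats the free cross-covariance as the "full" model and a zero cross-covariance as the "reduced" model, and picks the one with the smaller Akaike information criterion. The function names (`gaussian_loglik`, `aic`, `compare_models`) and the parameter counts are assumptions made for this illustration only.

```python
import numpy as np


def gaussian_loglik(cov, n):
    """Maximized Gaussian log-likelihood for n observations, given the
    maximum likelihood covariance estimate `cov` (means profiled out)."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * n * (d * np.log(2.0 * np.pi) + logdet + d)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik


def compare_models(y):
    """Compare a toy 'full' model (free covariance between the two
    columns of y) with a toy 'reduced' model (covariance fixed at zero)."""
    n = y.shape[0]
    # ML covariance estimate (divides by n, not n - 1).
    s_full = np.cov(y, rowvar=False, bias=True)
    # Under the zero-covariance restriction the MLE keeps only the variances.
    s_reduced = np.diag(np.diag(s_full))

    ll_full = gaussian_loglik(s_full, n)
    ll_reduced = gaussian_loglik(s_reduced, n)

    # Parameter counts: 2 means + 2 variances (+ 1 covariance in the full model).
    aic_full = aic(ll_full, 5)
    aic_reduced = aic(ll_reduced, 4)

    chosen = "reduced" if aic_reduced <= aic_full else "full"
    return {
        "ll_full": ll_full,
        "ll_reduced": ll_reduced,
        "aic_full": aic_full,
        "aic_reduced": aic_reduced,
        "chosen": chosen,
    }
```

Calling `compare_models(y)` on an `(n, 2)` array returns both log-likelihoods, both AIC values, and the selected model. When the restriction holds in the data, the reduced model loses little likelihood and saves one parameter, which is the situation in which the abstract reports efficiency gains from the combined analysis.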