Sarkar Abhra, Mallick Bani K, Carroll Raymond J
Department of Statistics, Texas A&M University, College Station, Texas 77843-3143, U.S.A.
Biometrics. 2014 Dec;70(4):823-34. doi: 10.1111/biom.12197. Epub 2014 Jun 25.
We consider the problem of robust estimation of the regression relationship between a response and a covariate based on a sample in which precise measurements of the covariate are not available, but error-prone surrogates for the unobserved covariate are available for each sampled unit. Existing methods often make restrictive and unrealistic assumptions about the density of the covariate and the densities of the regression and measurement errors: for example, normality and, for the latter two, also homoscedasticity and thus independence from the covariate. In this article we describe a Bayesian semiparametric methodology, based on mixtures of B-splines and mixtures induced by Dirichlet processes, that relaxes these restrictive assumptions. In particular, our models for the aforementioned densities adapt to asymmetry, heavy tails and multimodality. The models for the densities of the regression and measurement errors also accommodate conditional heteroscedasticity. In simulation experiments, our method vastly outperforms existing methods. We apply our method to data from nutritional epidemiology.
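The data setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method and uses no values from the paper: the bimodal covariate density, the linear regression function with slope 2, the error standard deviations that grow with |x|, and the function names `simulate` and `ols_slope` are all assumptions chosen for illustration. It shows only why the problem matters: regressing the response on the averaged surrogates, as if they were the true covariate, attenuates the estimated slope.

```python
import numpy as np

def simulate(n=2000, m=3, seed=0):
    """Draw (x, y, w) from a toy measurement-error model (all values assumed)."""
    rng = np.random.default_rng(seed)
    # Latent covariate: bimodal two-component normal mixture (assumption).
    first = rng.random(n) < 0.5
    x = np.where(first, rng.normal(-1.5, 0.5, n), rng.normal(1.5, 0.7, n))
    # Response: linear in x for simplicity; regression-error sd depends on x.
    y = 2.0 * x + (0.3 + 0.2 * np.abs(x)) * rng.standard_normal(n)
    # m replicated surrogates per unit; measurement-error sd depends on x.
    sd_w = (0.5 + 0.3 * np.abs(x))[:, None]
    w = x[:, None] + sd_w * rng.standard_normal((n, m))
    return x, y, w

def ols_slope(u, v):
    """Least-squares slope of v regressed on u (with intercept)."""
    u_c = u - u.mean()
    return float(u_c @ (v - v.mean()) / (u_c @ u_c))

x, y, w = simulate()
slope_true = ols_slope(x, y)                 # uses the unobservable covariate
slope_naive = ols_slope(w.mean(axis=1), y)   # treats surrogate means as x
print(f"slope using true x:       {slope_true:.3f}")
print(f"slope using surrogates:   {slope_naive:.3f}")
```

In practice only `y` and `w` would be observed. The naive slope is biased toward zero because the variance of the averaged surrogates exceeds that of the latent covariate, and a normal, homoscedastic error model would misdescribe this toy data on both counts.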