Chen Haiying, Quandt Sara A, Grzywacz Joseph G, Arcury Thomas A
Department of Biostatistical Sciences, Division of Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina; Center for Worker Health, Wake Forest School of Medicine, Winston-Salem, North Carolina.
Environmetrics. 2013 Mar;24(2):132-142. doi: 10.1002/env.2193. Epub 2012 Dec 20.
Environmental and biomedical research often produces data below the limit of detection (LOD), or left-censored data. Imputing explicit values for values < LOD in a multivariate setting, such as with longitudinal data, is difficult using a likelihood-based approach. A Bayesian multiple imputation (MI) method is introduced to handle left-censored multivariate data. A Gibbs sampler, which uses an iterative process, is employed to simulate the target multivariate distribution within a Bayesian framework. Following convergence, multiple plausible data sets are generated for analysis by standard statistical methods outside of a Bayesian framework. With explicit imputed values available, variables can be analyzed as either outcomes or predictors. We illustrate a practical application using longitudinal data from the Community Participatory Approach to Measuring Farmworker Pesticide Exposure (PACE3) study to evaluate the association between urinary acephate concentrations (indicating pesticide exposure) and self-reported potential pesticide poisoning symptoms. Additionally, a simulation study is used to evaluate the sampling properties of the estimators for distributional parameters as well as for regression coefficients estimated with the generalized estimating equation (GEE) approach. The results demonstrate that the Bayesian MI estimates perform well in most settings, and we recommend this valid and feasible approach for analyzing multivariate data with values < LOD.
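The iterative scheme described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, a bivariate normal model for log-scale concentrations at two time points, a single known LOD, and a noninformative Jeffreys prior on the mean and covariance; the function names (`draw_truncated_above`, `draw_mean_cov`, `gibbs_impute`) and the burn-in and thinning settings are hypothetical. Each Gibbs iteration alternates between drawing the parameters given the completed data and redrawing every value < LOD from its conditional normal distribution truncated above at the LOD.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for cdf / inverse cdf

def draw_truncated_above(mean, sd, upper, rng):
    """Draw from N(mean, sd^2) restricted to values <= upper (inverse-CDF method)."""
    p_upper = STD.cdf((upper - mean) / sd)
    u = min(max(rng.random() * p_upper, 1e-12), 1 - 1e-12)  # keep inv_cdf in range
    return min(mean + sd * STD.inv_cdf(u), upper)

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def chol2(m):
    """Lower Cholesky factor (l11, l21, l22) of a symmetric positive definite 2x2 matrix."""
    l11 = math.sqrt(m[0][0])
    l21 = m[1][0] / l11
    return l11, l21, math.sqrt(m[1][1] - l21 * l21)

def draw_mean_cov(data, rng):
    """Draw (mu, Sigma) given completed data, assuming a Jeffreys prior."""
    n = len(data)
    ybar = [sum(r[j] for r in data) / n for j in range(2)]
    s = [[sum((r[i] - ybar[i]) * (r[j] - ybar[j]) for r in data)
          for j in range(2)] for i in range(2)]
    # Sigma ~ inverse-Wishart(n - 1, S): draw W ~ Wishart(n - 1, S^-1)
    # with the Bartlett decomposition, then invert W.
    l11, l21, l22 = chol2(inv2(s))
    a11 = math.sqrt(rng.gammavariate((n - 1) / 2, 2))  # sqrt of chi-square(n - 1)
    a22 = math.sqrt(rng.gammavariate((n - 2) / 2, 2))  # sqrt of chi-square(n - 2)
    a21 = rng.gauss(0, 1)
    t11, t21, t22 = l11 * a11, l21 * a11 + l22 * a21, l22 * a22
    sigma = inv2([[t11 * t11, t11 * t21], [t11 * t21, t21 * t21 + t22 * t22]])
    # mu | Sigma ~ N(ybar, Sigma / n)
    c11, c21, c22 = chol2(sigma)
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    root_n = math.sqrt(n)
    mu = [ybar[0] + c11 * z1 / root_n, ybar[1] + (c21 * z1 + c22 * z2) / root_n]
    return mu, sigma

def gibbs_impute(rows, lod, n_imputations=5, burn_in=200, thin=20, seed=0):
    """rows: list of [y1, y2] on the log scale; None marks a value < LOD.

    Returns n_imputations completed copies of the data set.
    """
    rng = random.Random(seed)
    censored = [[v is None for v in r] for r in rows]
    # Arbitrary starting values just below the LOD for censored entries.
    data = [[lod - 0.5 if v is None else v for v in r] for r in rows]
    imputations = []
    for it in range(1, burn_in + thin * n_imputations + 1):
        mu, sigma = draw_mean_cov(data, rng)
        for r, c in zip(data, censored):
            for j in (0, 1):
                if c[j]:
                    k = 1 - j  # the other coordinate (observed or currently imputed)
                    cond_mean = mu[j] + sigma[j][k] / sigma[k][k] * (r[k] - mu[k])
                    cond_var = sigma[j][j] - sigma[j][k] ** 2 / sigma[k][k]
                    r[j] = draw_truncated_above(cond_mean, math.sqrt(cond_var), lod, rng)
        # After burn-in, keep every `thin`-th completed data set.
        if it > burn_in and (it - burn_in) % thin == 0:
            imputations.append([row[:] for row in data])
    return imputations
```

Each returned data set keeps the observed values unchanged and replaces every censored entry with a draw at or below the LOD, so standard tools (for example, a GEE fit per data set) can then be applied to each copy and the results combined. In practice, convergence would need to be checked rather than assumed from a fixed burn-in.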