Ordoñez José A, Bandyopadhyay Dipankar, Lachos Victor H, Cabral Celso R B
Department of Statistics, Campinas State University, Campinas, São Paulo, Brazil.
Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, U.S.A.
Spat Stat. 2018 Mar;23:109-123. doi: 10.1016/j.spasta.2017.12.001. Epub 2017 Dec 12.
Spatially referenced geostatistical responses collected in environmental sciences research are often subject to detection limits, where the measures are not fully quantifiable. This leads to censoring (left, right, interval, etc.), and various ad hoc statistical methods (such as choosing arbitrary detection limits, or data augmentation) are routinely employed during subsequent statistical analysis for inference and prediction. However, inference may be imprecise and sensitive to the assumptions and approximations involved in those arbitrary choices. To circumvent this, we propose a maximum likelihood estimation framework for the fixed effects and variance components, together with related prediction, via a novel application of the Stochastic Approximation of the Expectation Maximization (SAEM) algorithm, allowing for easy and elegant estimation of model parameters under censoring. Both simulation studies and an application to a real dataset on arsenic concentration collected by the Michigan Department of Environmental Quality demonstrate the advantages of our method over the available naïve techniques in terms of finite sample properties of the estimates, prediction, and robustness. The proposed methods can be implemented using the R package CensSpatial.
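To make the SAEM idea concrete, the following is a minimal sketch under strong simplifying assumptions: independent (non-spatial) normal responses with a single left detection limit per observation and only a mean and a standard deviation to estimate. It is not the authors' spatial model and does not use the CensSpatial API; the function name, arguments, and step-size schedule are hypothetical choices made for illustration. In a spatial setting the simulation step would presumably have to draw the censored responses jointly rather than one at a time.

```python
import random
from statistics import NormalDist


def saem_left_censored_normal(values, censored, n_iter=300, burn_in=100, seed=0):
    """Illustrative SAEM for i.i.d. normal data with left-censoring.

    values[i] is the observed response, or the detection limit when
    censored[i] is True. Returns estimates (mu, sigma).
    Hypothetical helper for illustration only; not part of CensSpatial.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    n = len(values)

    # Naive starting point: treat detection limits as if they were observed.
    mu = sum(values) / n
    sigma2 = max(sum((v - mu) ** 2 for v in values) / n, 1e-8)
    # Running approximations of the sufficient statistics sum(y), sum(y^2).
    s1, s2 = n * mu, n * (sigma2 + mu ** 2)

    for k in range(1, n_iter + 1):
        sigma = sigma2 ** 0.5

        # Simulation step: replace each censored response by a draw from the
        # current normal distribution truncated above at its detection limit
        # (inverse-CDF sampling).
        t1 = t2 = 0.0
        for v, is_censored in zip(values, censored):
            if is_censored:
                p_max = std_normal.cdf((v - mu) / sigma)
                # Numerical guard: inv_cdf requires 0 < u < 1.
                u = min(max(rng.random() * p_max, 1e-12), 1 - 1e-12)
                v = mu + sigma * std_normal.inv_cdf(u)
            t1 += v
            t2 += v * v

        # Stochastic approximation step: step size 1 during burn-in,
        # then decreasing so that the simulation noise averages out.
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        s1 += gamma * (t1 - s1)
        s2 += gamma * (t2 - s2)

        # Maximization step: closed-form normal MLE from the statistics.
        mu = s1 / n
        sigma2 = max(s2 / n - mu ** 2, 1e-8)

    return mu, sigma2 ** 0.5
```

Calling the function with the recorded values (detection limits in place of censored responses) and a matching list of censoring indicators returns the estimated mean and standard deviation. On simulated data, this kind of estimate can be compared with the naive alternative of substituting the detection limit for each censored value.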