Jacob Benjamin G, Novak Robert J, Toe Laurent, Sanfo Moussa S, Afriyie Abena N, Ibrahim Mohammed A, Griffith Daniel A, Unnasch Thomas R
Global Infectious Disease Research Program, Department of Public Health, College of Public Health, University of South Florida, 3720 Spectrum Blvd, Suite 304, Tampa, Florida 33612, USA.
Geo Spat Inf Sci. 2012;15(2):117-133. doi: 10.1080/10095020.2012.714663. Epub 2012 Sep 24.
The standard methods for regression analyses of clustered riverine larval habitat data of a major black-fly vector of onchocerciasis postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations of multiple riverine-based larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially, the data were aggregated in PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data were then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high- and low-density clusters stratified by Annual Biting Rates (ABR). These data were overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., logistic, Poisson and negative binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data.
Thereafter, Durbin-Watson test statistics were used in PROC AUTOREG to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process. Bayesian uncertainty matrices were also constructed in PROC MCMC, employing normal priors for each of the sampled estimators. The residuals revealed both spatially structured and unstructured error effects in the high- and low-ABR-stratified clusters. The analyses also revealed that the estimators for level of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators for distance between habitats and floating vegetation were important for the low-ABR-stratified clusters. Varying and constant coefficient regression models, ABR-stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainty effects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced riverine larval habitat estimators. The asymptotic distribution of the resulting residual-adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established, while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive habitats based on spatiotemporal field-sampled count data.
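The Durbin-Watson diagnostic named in the abstract can be illustrated with a minimal sketch. The study ran the test in SAS; the Python function below is a stand-in, not the authors' code. The function name `durbin_watson` and the residual values are hypothetical, and the residuals are assumed to be ordered (for example by sampling date) before the statistic is computed. The statistic is d = sum((e_t - e_{t-1})^2) / sum(e_t^2). Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson statistic for an ordered sequence of residuals.

    d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2
    """
    e = [float(x) for x in residuals]
    if len(e) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(x * x for x in e)
    if denominator == 0.0:
        raise ValueError("residuals are all zero; statistic is undefined")
    # Sum of squared successive differences of the residuals.
    numerator = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return numerator / denominator


# Illustrative residuals only (not data from the study).
alternating = [1, -1, 1, -1]   # sign flips every step: negative autocorrelation
persistent = [1, 1, -1, -1]    # runs of equal sign: positive autocorrelation

print(durbin_watson(alternating))  # 3.0
print(durbin_watson(persistent))   # 1.0
```

In practice the statistic is compared against tabulated lower and upper bounds that depend on the sample size and the number of regressors, so the raw value alone does not decide the test.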