Bai Yun, Kang Jian, Song Peter X-K
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A.
Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, U.S.A.
Biometrics. 2014 Sep;70(3):661-70. doi: 10.1111/biom.12199. Epub 2014 Jun 19.
Spatial-clustered data are high-dimensional correlated measurements collected from units or subjects that are spatially clustered. Such data arise frequently in studies in the social and health sciences. We propose a unified modeling framework, termed GeoCopula, to characterize both large-scale and small-scale variation for various data types, including continuous, binary, and count data as special cases. To overcome challenges in estimation and inference for the model parameters, we propose an efficient composite likelihood approach in which the estimation efficiency results from the construction of over-identified joint composite estimating equations. Consequently, the statistical theory for the proposed estimation is developed by extending the classical theory of the generalized method of moments. A clear advantage of the proposed estimation method is its computational feasibility. We conduct several simulation studies to assess the performance of the proposed models and estimation methods for both Gaussian and binary spatial-clustered data. The results show a clear improvement in estimation efficiency over the conventional composite likelihood method. An illustrative data example is included to motivate and demonstrate the proposed method.
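For orientation, the conventional pairwise composite likelihood that the abstract uses as its baseline can be sketched as below. This is not the GeoCopula model or the over-identified joint estimating equations proposed in the paper. It is a minimal illustration that assumes a zero-mean Gaussian field with common variance and an exponential correlation function. The function names, the parameters `sigma2` and `phi`, and the distance cutoff `max_dist` are all hypothetical choices made for this example.

```python
import math
from itertools import combinations


def exp_corr(d, phi):
    """Exponential correlation rho(d) = exp(-d / phi); an assumed form, not taken from the paper."""
    return math.exp(-d / phi)


def bvn_logpdf(x, y, sigma2, rho):
    """Log density of a zero-mean bivariate normal with common variance sigma2 and correlation rho."""
    one_minus_r2 = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / (sigma2 * one_minus_r2)
    return (-math.log(2.0 * math.pi) - math.log(sigma2)
            - 0.5 * math.log(one_minus_r2) - 0.5 * quad)


def pairwise_cl(values, coords, sigma2, phi, max_dist=float("inf")):
    """Pairwise composite log-likelihood: sum of bivariate log densities over
    all pairs of sites whose distance is positive and at most max_dist."""
    total = 0.0
    for i, j in combinations(range(len(values)), 2):
        d = math.dist(coords[i], coords[j])
        if d == 0.0 or d > max_dist:
            continue  # skip co-located sites (rho = 1) and distant pairs
        total += bvn_logpdf(values[i], values[j], sigma2, exp_corr(d, phi))
    return total


if __name__ == "__main__":
    # Hypothetical toy data: four sites on a line with made-up measurements.
    coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    values = [0.3, 0.5, -0.2, -0.4]
    print(pairwise_cl(values, coords, sigma2=1.0, phi=1.5, max_dist=2.0))
```

Maximizing `pairwise_cl` over `sigma2` and `phi` with any numerical optimizer gives the conventional composite likelihood estimate. Each pair contributes only a bivariate density, so no high-dimensional joint likelihood is ever evaluated, which is why composite likelihood methods remain computationally feasible for spatial data of this kind.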