Zhang Wenyang, Fan Jianqing, Sun Yan
Department of Mathematical Sciences, University of Bath, UK.
Ann Stat. 2009 Oct 1;37(5A):2377-2408. doi: 10.1214/08-AOS662.
In the analysis of cluster data, the regression coefficients are frequently assumed to be the same across all clusters. This hampers the ability to study the varying impacts of factors on each cluster. In this paper, a semiparametric model is introduced to account for varying impacts of factors over clusters by using cluster-level covariates. It achieves parsimony of parametrization and allows the exploration of nonlinear interactions. The random effect in the semiparametric model also accounts for within-cluster correlation. A local-linear-based estimation procedure is proposed for estimating the functional coefficients, the residual variance, and the within-cluster correlation matrix. The asymptotic properties of the proposed estimators are established, and a method for constructing simultaneous confidence bands is proposed and studied. In addition, relevant hypothesis testing problems are addressed. Simulation studies are carried out to demonstrate the power of the proposed methods in finite samples. The proposed model and methods are used to analyse the second birth interval in Bangladesh, leading to some interesting findings.
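To make the idea of local linear estimation of functional coefficients concrete, the following is a minimal sketch, not the authors' procedure. It assumes a model of the form y_ij = X_ij' a(U_i) + b_i + e_ij, where U_i is a cluster-level covariate and b_i a cluster random effect; this form, the function name `local_linear_coef`, the Epanechnikov kernel, the bandwidth, and the simulated data are all illustrative assumptions. The sketch also ignores within-cluster correlation when fitting (working independence), which the paper treats explicitly.

```python
import numpy as np

def local_linear_coef(u0, U, X, y, h):
    """Hypothetical helper: local linear estimate of a(u0) in y = X a(U) + noise.

    U : (n,) cluster-level covariate, repeated for each observation in a cluster
    X : (n, p) design matrix; y : (n,) response; h : bandwidth (assumed given)
    """
    d = U - u0
    t = d / h
    # Epanechnikov kernel weights (an assumed choice of kernel)
    w = np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)
    # Local linear expansion: a(U) is approximated by a(u0) + a'(u0) * (U - u0)
    Z = np.hstack([X, X * d[:, None]])
    sw = np.sqrt(w)
    # Weighted least squares under working independence
    beta, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return beta[: X.shape[1]]  # first p entries estimate a(u0)

# Simulated clustered data (illustrative only, not from the paper)
rng = np.random.default_rng(0)
m, n_i = 200, 5                                   # clusters, observations per cluster
U = np.repeat(rng.uniform(0, 1, m), n_i)          # cluster-level covariate
x = rng.normal(size=m * n_i)
X = np.column_stack([np.ones(m * n_i), x])
b = np.repeat(rng.normal(scale=0.5, size=m), n_i) # cluster random effect
eps = rng.normal(scale=0.5, size=m * n_i)
y = np.sin(2 * np.pi * U) + (1 + U) * x + b + eps

est = local_linear_coef(0.5, U, X, y, h=0.15)
print(est)  # estimates of (a_0(0.5), a_1(0.5)); true values are (0, 1.5)
```

Evaluating `local_linear_coef` over a grid of `u0` values traces out the estimated coefficient functions. A fuller treatment would select the bandwidth from the data and use the estimated within-cluster correlation, as the abstract describes.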