Chao Edward C
Insightful Corporation, 1700 Westlake Avenue N. Suite 500, Seattle, WA 98109, USA.
Stat Med. 2006 Jul 30;25(14):2450-68. doi: 10.1002/sim.2368.
Correlation is always a concern in the analysis of clustered data. One area of interest is the development of a general correlation modelling approach for high-dimensional data with unbalanced hierarchical and heterogeneous data structures, e.g. multilevel data. Commonly used correlation structures may have limitations in such situations. In this paper, we propose two extensions, multiblock and multilayer correlations. These methods are very flexible in modelling correlation and can be incorporated into many multivariate approaches, although the discussion focuses mainly on applications under generalized estimating equations (GEE) methods. The approaches are especially useful in GEE when each cluster is large and complex but the number of clusters is small. If an incorrect correlation structure is applied to such data, the resulting estimates are less efficient. Multiblock and multilayer correlations extend GEE methods to model complicated multilevel data with an arbitrary number of levels and arbitrary cluster sizes. The extended estimating equation for the correlation parameters has an orthogonality property, and the computation is very efficient. A simulation study compares the conventional methods with the proposed methods and shows the gain in relative efficiency and the flexibility in modelling various structures.