van den Oord E J
Utrecht University, Utrecht, The Netherlands.
Stat Methods Med Res. 2001 Dec;10(6):393-407. doi: 10.1177/096228020101000603.
Multilevel modelling is a technique for fitting linear models to samples with a hierarchical or clustered structure. Clustered data are common in genetic research, where family members are either required for studying hereditary factors or serve a methodological purpose. These samples imply a natural hierarchy because genetically related individuals are grouped within families. We first demonstrate the use of multilevel modelling to study latent genetic and environmental components of variance in extended families where subjects may be related as twins, full siblings, half siblings, or cousins. Next, measured genotypes are included to estimate locus effects. Because the model accounts for the clustering of observations by estimating a random intercept at the family level, it tests for genotype effects on the phenotype within families, so that possible population stratification effects cannot cause false positive results. Several extensions are discussed, such as testing for genotype-environment interactions, analysing different types of response scales, or tailoring the model to other sample structures. To illustrate the approach, we used birth weight data of 5562 children from 3643 fathers and 3186 mothers in 2873 extended families, to which simulated genotypes of a hypothetical locus were added.
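The within-family logic described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's multilevel model: it replaces the random family intercept with plain family-mean centering, a simpler stand-in that likewise compares relatives only with each other. All names and numbers (two subpopulations, allele frequencies of 0.2 and 0.8, 1000 sibships of three, a true locus effect of zero) are hypothetical choices made for illustration and have no connection to the birth weight data.

```python
import random

def draw_parent(rng, freq):
    """Return a parent's two alleles (1 = hypothetical allele of interest)."""
    return (int(rng.random() < freq), int(rng.random() < freq))

def simulate(n_families=1000, sibs=3, beta=0.0, seed=1):
    """Simulate sibships from two subpopulations (all settings are assumptions).

    The subpopulations differ in allele frequency AND in mean phenotype, which
    is the classic population stratification setup. beta is the true locus
    effect; the default of 0.0 means any detected association is spurious.
    """
    rng = random.Random(seed)
    rows = []  # (family_id, genotype, phenotype)
    for fam in range(n_families):
        subpop = fam % 2
        freq = 0.2 if subpop == 0 else 0.8
        # Family-level effect: subpopulation mean plus family-specific noise.
        family_effect = float(subpop) + rng.gauss(0, 0.5)
        father, mother = draw_parent(rng, freq), draw_parent(rng, freq)
        for _ in range(sibs):
            # Each child receives one randomly chosen allele from each parent.
            g = rng.choice(father) + rng.choice(mother)
            y = family_effect + beta * g + rng.gauss(0, 1)
            rows.append((fam, g, y))
    return rows

def slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def center_within_family(rows):
    """Subtract each family's mean genotype and mean phenotype from its members."""
    by_family = {}
    for fam, g, y in rows:
        by_family.setdefault(fam, []).append((g, y))
    gs, ys = [], []
    for members in by_family.values():
        mg = sum(g for g, _ in members) / len(members)
        my = sum(y for _, y in members) / len(members)
        for g, y in members:
            gs.append(g - mg)
            ys.append(y - my)
    return gs, ys

rows = simulate()
# Pooled regression ignores families, so it absorbs the subpopulation difference.
naive = slope([g for _, g, _ in rows], [y for _, _, y in rows])
# The within-family regression uses only differences between siblings.
within = slope(*center_within_family(rows))
print(f"pooled slope: {naive:.3f}, within-family slope: {within:.3f}")
```

Under these assumptions the pooled slope should come out clearly positive even though the locus has no effect, while the within-family slope should stay near zero. A full analysis along the lines of the abstract would instead fit a mixed model with a family-level random intercept, which also yields the variance components.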