Rönnegård Lars, McFarlane S Eryn, Husby Arild, Kawakami Takeshi, Ellegren Hans, Qvarnström Anna
Department of Clinical Sciences, Swedish University of Agricultural Sciences, SE-75007 Uppsala, Sweden.
Department of Animal Ecology, Evolutionary Biology Centre (EBC), Uppsala University, Norbyvägen 18D, SE-75236 Uppsala, Sweden.
Methods Ecol Evol. 2016 Jul;7(7):792-799. doi: 10.1111/2041-210X.12535. Epub 2016 Feb 5.
Genomewide association studies (GWAS) enable detailed dissection of the genetic basis of organisms' ability to adapt to a changing environment. In long-term studies of natural populations, individuals are often marked at one point in their life and then repeatedly recaptured. A GWAS method for such data must therefore account for repeated sampling. In a GWAS, the effects of thousands of single-nucleotide polymorphisms (SNPs) need to be fitted, so any model development is constrained by computational requirements. A method is therefore required that can fit a highly hierarchical model and is still computationally fast enough to be useful.

Our method fits fixed SNP effects in a linear mixed model that can include both random polygenic effects and permanent environmental effects. In this way, the model can correct for population structure and model repeated measures. The covariance structure of the linear mixed model is estimated first and is then used in a generalized least squares setting to fit the SNP effects. The method was evaluated in a simulation study based on observed genotypes from a long-term study of collared flycatchers in Sweden.

The method was successful in estimating permanent environmental effects from simulated repeated-measures data. Additionally, we found that the repeated-measures model gives a substantial increase in power compared to a model using average phenotypes as the response, especially for phenotypes that vary strongly between years.

The method is available in the R package RepeatABEL. It increases the power of GWAS with repeated measures, especially for long-term studies of natural populations, and the R implementation is expected to facilitate modelling of longitudinal data in studies of both animal and human populations.
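
The two-step idea described above (estimate the covariance structure once, then reuse it in generalized least squares for every SNP) can be illustrated with a minimal NumPy sketch. This is not the RepeatABEL interface: the function names `build_covariance` and `gls_snp_scan`, the toy data, the simple marker-based relationship matrix `G`, and the variance components are all assumptions made for illustration. In particular, the variance components are plugged in as fixed values, whereas the method estimates them from the data in a first step.

```python
import numpy as np

def build_covariance(Z, G, var_g, var_p, var_e):
    """Covariance of repeated observations under the assumed model:
    V = var_g * Z G Z' (polygenic) + var_p * Z Z' (permanent env.) + var_e * I."""
    n_obs = Z.shape[0]
    return var_g * (Z @ G @ Z.T) + var_p * (Z @ Z.T) + var_e * np.eye(n_obs)

def gls_snp_scan(y, X, snps, Z, V):
    """Fit each SNP as a fixed effect by GLS, factorising V only once.
    Returns one row per SNP: (effect estimate, standard error)."""
    L = np.linalg.cholesky(V)            # V = L L'
    y_w = np.linalg.solve(L, y)          # whitened response
    X_w = np.linalg.solve(L, X)          # whitened covariates
    S_w = np.linalg.solve(L, Z @ snps)   # whitened SNPs, expanded to observations
    out = np.empty((snps.shape[1], 2))
    for j in range(snps.shape[1]):
        D = np.column_stack([X_w, S_w[:, j]])
        DtD_inv = np.linalg.inv(D.T @ D)  # equals (D' V^-1 D)^-1 on the original scale
        beta = DtD_inv @ (D.T @ y_w)
        out[j] = beta[-1], np.sqrt(DtD_inv[-1, -1])
    return out

# Toy data (hypothetical): 40 individuals, 3 records each, 6 SNPs coded 0/1/2.
rng = np.random.default_rng(1)
n_ind, n_rep, n_snp = 40, 3, 6
ids = np.repeat(np.arange(n_ind), n_rep)
Z = np.eye(n_ind)[ids]                   # maps individuals to observations
snps = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)
centered = snps - snps.mean(axis=0)
G = centered @ centered.T / n_snp        # crude relationship matrix (assumption)

# Variance components assumed known here; the method estimates them first.
var_g, var_p, var_e = 0.3, 0.2, 0.5
V = build_covariance(Z, G, var_g, var_p, var_e)

X = np.ones((n_ind * n_rep, 1))          # intercept only
perm_env = rng.normal(0.0, np.sqrt(var_p), n_ind)
y = (Z @ (0.5 * snps[:, 0]) + Z @ perm_env
     + rng.normal(0.0, np.sqrt(var_e), n_ind * n_rep))

results = gls_snp_scan(y, X, snps, Z, V)
wald = results[:, 0] / results[:, 1]     # per-SNP Wald statistics
```

Factorising `V` once and whitening all SNPs together is what keeps the per-SNP cost low in this sketch; each SNP then needs only a small ordinary least squares fit on the whitened data.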