Kärkkäinen Hanni P, Sillanpää Mikko J
Department of Agricultural Sciences, University of Helsinki, Helsinki FIN-00014, Finland.
Ann Hum Genet. 2012 Nov;76(6):510-23. doi: 10.1111/j.1469-1809.2012.00729.x. Epub 2012 Sep 12.
Population-based association analyses are more powerful than within-family analyses in identifying genetic loci associated with a phenotype of interest. However, if the population or sample structure is omitted from the model, population stratification and cryptic relatedness may lead to false-positive and false-negative signals that are caused by relatedness between individuals rather than by close linkage between the marker and the trait loci. Therefore, it is important to correct or account for these confounders in population-based association analyses. There is, however, accumulating evidence that when a multilocus association model is fitted, the genetic relationships between the individuals can be captured by the markers themselves, making it possible to use such models without an additional correction for the population or sample structure. In this work we have further investigated this possibility in the context of Bayesian multilocus association models, using the extended Bayesian LASSO and indicator-based variable selection. In particular, we have studied whether these multilocus models benefit from the inclusion of an additional polygenic term that represents the genetic variation not captured by the markers and accounts for the residual dependencies between the individuals. We have found that although the models may benefit from the inclusion of the polygenic component, omitting the component does not severely damage model performance.
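The abstract does not spell out the model, so the following is only a generic sketch of a multilocus association model with an additional polygenic term. All notation is assumed here rather than taken from the paper: $y_i$ is the phenotype of individual $i$, $x_{ij}$ its genotype code at marker $j$, $\beta_j$ the marker effect, $u_i$ the polygenic effect, $\mathbf{A}$ a known relationship matrix, and $\sigma_u^2$, $\sigma_e^2$ the variance components.

```latex
% Hypothetical notation; a sketch, not the authors' exact specification.
\begin{align}
  y_i &= \beta_0 + \sum_{j=1}^{p} x_{ij}\,\beta_j + u_i + e_i,
        \qquad i = 1, \dots, n, \\
  \mathbf{u} &\sim \mathcal{N}\!\left(\mathbf{0},\, \mathbf{A}\,\sigma_u^2\right),
  \qquad
  \mathbf{e} \sim \mathcal{N}\!\left(\mathbf{0},\, \mathbf{I}\,\sigma_e^2\right).
\end{align}
```

Under this sketch, the model without the polygenic component is obtained by dropping $u_i$, so that any dependence between individuals has to be absorbed by the marker effects $\beta_j$. The extended Bayesian LASSO and the indicator-based variable selection would then differ in the prior placed on the $\beta_j$, which is not shown here.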