Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, California 95616, USA.
Genetics. 2010 Aug;185(4):1411-23. doi: 10.1534/genetics.110.114819. Epub 2010 Jun 1.
Loci involved in local adaptation can potentially be identified by an unusual correlation between allele frequencies and important ecological variables or by extreme allele frequency differences between geographic regions. However, such comparisons are complicated by differences in sample sizes and by the neutral correlation of allele frequencies across populations due to shared history and gene flow. To overcome these difficulties, we have developed a Bayesian method that estimates the empirical pattern of covariance in allele frequencies between populations from a set of markers and then uses this as a null model for a test at individual SNPs. In our model, the sample frequencies of an allele across populations are drawn from a set of underlying population frequencies; a transform of these population frequencies is assumed to follow a multivariate normal distribution. We first estimate the covariance matrix of this multivariate normal across loci using Markov chain Monte Carlo. At each SNP, we then provide a measure of the support, a Bayes factor, for a model where an environmental variable has a linear effect on the transformed allele frequencies compared to a model given by the covariance matrix alone. This test is shown through power simulations to outperform existing correlation tests. We also demonstrate that our method can be used to identify SNPs with unusually large allele frequency differentiation and offers a powerful alternative to tests based on pairwise or global F_ST. Software is available at http://www.eve.ucdavis.edu/gmcoop/.
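As a rough illustration of the per-SNP test described in the abstract, the Python sketch below computes a log Bayes factor for a single SNP. It compares a model in which an environmental variable shifts the mean of the transformed allele frequencies linearly against a model given by the covariance matrix alone. This is not the authors' software, and it rests on several simplifying assumptions: the covariance matrix `omega` is taken as already estimated (here it is a made-up matrix rather than a Markov chain estimate), the transformed frequencies `theta` are treated as directly observed so that sampling noise is ignored, the variance is scaled by `eps * (1 - eps)` for an assumed ancestral frequency `eps`, and the effect size `beta` is given a uniform prior over a hypothetical grid. All function and variable names are illustrative.

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Log density of a multivariate normal, computed via Cholesky factorization."""
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, x - mean)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + log_det + z @ z)

def log_bayes_factor(theta, env, omega, eps, beta_grid):
    """Log Bayes factor: linear environmental effect vs. covariance-only model.

    theta     : transformed allele frequencies, one per population (assumed observed)
    env       : standardized environmental variable, one value per population
    omega     : covariance matrix of transformed frequencies (assumed pre-estimated)
    eps       : assumed ancestral allele frequency for this SNP
    beta_grid : grid of effect sizes; the prior on beta is uniform over this grid
    """
    cov = eps * (1.0 - eps) * omega          # assumed variance scaling
    log_null = mvn_logpdf(theta, eps + 0.0 * env, cov)
    log_alt = np.array([mvn_logpdf(theta, eps + b * env, cov) for b in beta_grid])
    # Average the likelihood over the grid (uniform prior), using log-sum-exp for stability.
    peak = log_alt.max()
    log_marginal = peak + np.log(np.mean(np.exp(log_alt - peak)))
    return log_marginal - log_null

# Toy example with six populations (all values are made up for illustration).
raw_env = np.array([-1.5, -0.9, -0.3, 0.3, 0.9, 1.5])
env = (raw_env - raw_env.mean()) / raw_env.std()
omega = 0.05 * np.eye(6) + 0.01 * np.ones((6, 6))   # shared drift plus population-specific drift
eps = 0.5
beta_grid = np.linspace(-0.2, 0.2, 81)

theta_neutral = np.full(6, eps)          # no environmental signal
theta_adapted = eps + 0.1 * env          # frequencies track the environment

log_bf_neutral = log_bayes_factor(theta_neutral, env, omega, eps, beta_grid)
log_bf_adapted = log_bayes_factor(theta_adapted, env, omega, eps, beta_grid)
print("log BF, neutral SNP:", log_bf_neutral)
print("log BF, adapted SNP:", log_bf_adapted)
```

In this toy setting the SNP whose frequencies track the environment receives a positive log Bayes factor, while the SNP with no environmental signal receives a negative one. A fuller treatment would also account for the sampling of allele counts from the underlying population frequencies and for uncertainty in the covariance matrix.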