Gompert Zachariah, Egan Scott P, Barrett Rowan D H, Feder Jeffrey L, Nosil Patrik
Department of Biology, Utah State University, Logan, UT, 84322, USA.
Department of BioSciences, Rice University, Houston, TX, 77005, USA.
Mol Ecol. 2017 Jan;26(1):365-382. doi: 10.1111/mec.13867. Epub 2016 Oct 27.
The study of ecological speciation is inherently linked to the study of selection. Methods for estimating phenotypic selection within a generation based on associations between trait values and fitness (e.g. survival) of individuals are established. These methods attempt to disentangle selection acting directly on a trait from indirect selection caused by correlations with other traits via multivariate statistical approaches (i.e. inference of selection gradients). The estimation of selection on genotypic or genomic variation could also benefit from disentangling direct and indirect selection on genetic loci. However, achieving this goal is difficult with genomic data because the number of potentially correlated genetic loci (p) is very large relative to the number of individuals sampled (n). In other words, the number of model parameters exceeds the number of observations (p ≫ n). We present simulations examining the utility of whole-genome regression approaches (i.e. Bayesian sparse linear mixed models) for quantifying direct selection in cases where p ≫ n. Such models have been used for genome-wide association mapping and are common in artificial breeding. Our results show they hold promise for studies of natural selection in the wild and thus of ecological speciation. But we also demonstrate important limitations to the approach and discuss study designs required for more robust inferences.
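The distinction between direct and indirect selection, and the reason p ≫ n is a problem, can be illustrated with a toy calculation. The sketch below is not the Bayesian sparse linear mixed model examined in the paper. It uses the classical selection-gradient calculation (gradient = inverse of the genotype covariance matrix times the vector of selection differentials) on two hypothetical loci. The genotype values, the effect size of 0.3, and all function names are assumptions made for illustration only.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def selection_on_two_loci(g1, g2, fitness):
    """Return (s, beta, det) for two loci.

    s    : selection differentials, cov(genotype, relative fitness);
           these mix direct and indirect selection.
    beta : selection gradients, P^{-1} s (direct selection only),
           or None when the covariance matrix P is singular.
    det  : determinant of P.
    """
    w_bar = mean(fitness)
    w_rel = [w / w_bar for w in fitness]
    s = (cov(g1, w_rel), cov(g2, w_rel))
    p11, p22, p12 = cov(g1, g1), cov(g2, g2), cov(g1, g2)
    det = p11 * p22 - p12 * p12
    if abs(det) < 1e-12:
        return s, None, det  # gradients are not identifiable
    beta = ((p22 * s[0] - p12 * s[1]) / det,
            (p11 * s[1] - p12 * s[0]) / det)
    return s, beta, det

# Hypothetical genotypes (0/1/2 allele counts) at two correlated loci.
g1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
g2 = [0, 1, 1, 1, 2, 1, 0, 0, 2, 2]
# Assumption: only locus 1 affects fitness; locus 2 is neutral.
fitness = [1 + 0.3 * a for a in g1]

s, beta, det = selection_on_two_loci(g1, g2, fitness)
print("differentials:", s)   # locus 2 shows an association with fitness
print("gradients:    ", beta)  # but its gradient is (numerically) zero

# With only two individuals and two loci (p >= n), P is singular.
idx = [1, 4]
s_small, beta_small, det_small = selection_on_two_loci(
    [g1[i] for i in idx], [g2[i] for i in idx], [fitness[i] for i in idx])
print("gradients with n = 2:", beta_small)
```

In this toy case the neutral locus has a positive selection differential purely because it is correlated with the selected locus, while its gradient is zero. Once the number of loci reaches the number of individuals the covariance matrix cannot be inverted, which is why approaches that add prior structure, such as the sparse mixed models discussed above, are needed when p ≫ n.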