CNRS, LECA UMR 5553, Univ. Grenoble Alpes, Grenoble, France.
CNRS, TIMC-IMAG UMR 5525, Univ. Grenoble Alpes, Grenoble, France.
Mol Ecol Resour. 2018 Nov;18(6):1223-1233. doi: 10.1111/1755-0998.12906. Epub 2018 Jun 17.
Ordination is a common tool in ecology that aims to represent complex biological information in a reduced space. In landscape genetics, ordination methods such as principal component analysis (PCA) have been used to detect adaptive variation from genomic data. Redundancy analysis (RDA) is another ordination approach that is useful for detecting adaptive variation; unlike PCA, it uses environmental data in addition to genotype data. This study proposes a test statistic based on RDA to search for loci under selection. We compare RDA to pcadapt, which is an unconstrained ordination method, and to a latent factor mixed model (LFMM), which is a univariate genotype-environment association method. Individual-based simulations identify evolutionary scenarios in which RDA genome scans have greater statistical power than genome scans based on PCA. By constraining the analysis with environmental variables, RDA performs better than PCA in identifying adaptive variation when selection gradients are weakly correlated with population structure. In addition, we show that although RDA and LFMM have similar power to identify genetic markers associated with environmental variables, the RDA-based procedure has the advantage of identifying the main selective gradients as a combination of environmental variables. To give a concrete illustration of RDA in population genomics, we apply this method to the detection of outliers and selective gradients in an SNP data set of Populus trichocarpa (Geraldes et al., ). The RDA-based approach identifies the main selective gradient, which contrasts southern and coastal populations with northern and continental populations along the north-western American coast.
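The idea of an RDA-based genome scan can be illustrated with a minimal sketch. The abstract does not specify the test statistic, so everything below is an assumption made for illustration: genotypes are regressed on the environmental variables, a PCA (via SVD) of the fitted values gives the constrained axes, and each locus is scored by a robustly standardised squared distance of its loadings on the retained axes. The function names `rda_outlier_scores` and `simulate`, the median/MAD standardisation, and the toy simulation (no population structure, one locus under selection along a combination of two environmental variables) are hypothetical and are not taken from the paper.

```python
import numpy as np


def rda_outlier_scores(genotypes, env, n_axes):
    """Score loci by how extreme their loadings are on the first RDA axes.

    genotypes: (n_individuals, n_loci) array of allele counts (0, 1, 2).
    env:       (n_individuals, n_env) array of environmental variables.
    n_axes:    number of constrained axes to retain.
    """
    # Centre both tables so that no intercept is needed in the regression.
    Y = genotypes - genotypes.mean(axis=0)
    X = env - env.mean(axis=0)

    # Step 1: multivariate linear regression of genotypes on environment.
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ coef

    # Step 2: PCA of the fitted values; the rows of vt hold the locus loadings.
    _, _, vt = np.linalg.svd(fitted, full_matrices=False)
    loadings = vt[:n_axes].T  # shape (n_loci, n_axes)

    # Step 3: robust standardisation per axis (median and scaled MAD),
    # then a squared distance summed over the retained axes.
    center = np.median(loadings, axis=0)
    scale = 1.4826 * np.median(np.abs(loadings - center), axis=0)
    z = (loadings - center) / scale
    return (z ** 2).sum(axis=1)


def simulate(n_ind=200, n_loci=500, seed=0):
    """Toy data: neutral loci plus one locus (index 0) tracking the environment."""
    rng = np.random.default_rng(seed)
    env = rng.normal(size=(n_ind, 2))

    # Neutral loci: allele frequencies independent of the environment.
    freqs = rng.uniform(0.1, 0.9, size=n_loci)
    geno = rng.binomial(2, freqs, size=(n_ind, n_loci)).astype(float)

    # Selected locus: frequency follows a combination of both variables.
    gradient = (env[:, 0] + env[:, 1]) / np.sqrt(2)
    p_selected = 1.0 / (1.0 + np.exp(-2.0 * gradient))
    geno[:, 0] = rng.binomial(2, p_selected)
    return geno, env


geno, env = simulate()
scores = rda_outlier_scores(geno, env, n_axes=2)
top_locus = int(np.argmax(scores))
print("highest-scoring locus:", top_locus)
```

In this sketch the highest scores point to candidate loci, and the environmental coefficients behind the leading axes indicate which combination of variables drives the signal. A real analysis would also need to handle population structure and convert the scores into calibrated p-values, both of which are omitted here.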