INRA, CNRS, Université Côte d'Azur, ISA, Sophia Antipolis, France.
Heredity (Edinb). 2018 Jun;120(6):485-499. doi: 10.1038/s41437-017-0042-1. Epub 2018 Jan 17.
Population genetic methods are widely used to retrace the introduction routes of invasive species. The unsupervised Bayesian clustering algorithm implemented in STRUCTURE is among the most frequently used of these methods, but its ability to provide reliable information about introduction routes has never been assessed. We simulated microsatellite datasets to evaluate the extent to which the results provided by STRUCTURE are misleading for the inference of introduction routes. We focused on an invasion scenario involving one native and two independently introduced populations, because it is the only scenario that can be rejected when a STRUCTURE analysis at K = 2 (two clusters) yields a particular clustering pattern. Results were classified as "misleading" or "non-misleading". We investigated the influence of effective population size, bottleneck severity and number of loci on the type and frequency of misleading results. Misleading STRUCTURE results were obtained for 10% of all simulated datasets. Our results highlighted two categories of misleading output. The first occurs when the native population has a low level of genetic diversity. In this case, the two introduced populations may be very similar, despite their independent introduction histories. The second category results from convergence issues in STRUCTURE at K = 2, which arise when severe bottlenecks and/or large numbers of loci produce high levels of differentiation between the three populations. Overall, the risk of being misled by STRUCTURE when inferring introduction routes is moderate, but it is important to remain cautious when low genetic diversity or genuine multimodality between runs is involved.
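The "misleading" versus "non-misleading" classification at K = 2 can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that a run counts as misleading when the two independently introduced populations fall in one cluster and the native population in the other, since that pattern would wrongly reject the independent-introduction scenario. The function name, the population labels and the 0.5 cutoff on mean membership coefficients are all hypothetical choices made for illustration.

```python
from statistics import mean


def classify_k2_run(q_cluster1, cutoff=0.5):
    """Classify one K = 2 clustering result for a three-population scenario.

    q_cluster1 maps each population label ("native", "intro_a", "intro_b";
    hypothetical labels) to the membership coefficients of its individuals
    in cluster 1. Membership in cluster 2 is taken to be 1 - q.

    Assumption (not from the source): a population is assigned to cluster 1
    when its mean membership exceeds `cutoff`, otherwise to cluster 2.
    """
    assignment = {
        pop: 1 if mean(values) > cutoff else 2
        for pop, values in q_cluster1.items()
    }
    introduced_together = assignment["intro_a"] == assignment["intro_b"]
    native_apart = assignment["native"] != assignment["intro_a"]
    # Assumed misleading pattern: the introduced populations group together,
    # apart from the native population, despite independent introductions.
    if introduced_together and native_apart:
        return "misleading"
    return "non-misleading"


# Illustrative, made-up membership coefficients (not data from the study).
run_a = {
    "native": [0.95, 0.90, 0.97],
    "intro_a": [0.10, 0.05, 0.08],
    "intro_b": [0.08, 0.12, 0.03],
}
run_b = {
    "native": [0.90, 0.93, 0.88],
    "intro_a": [0.92, 0.85, 0.95],
    "intro_b": [0.10, 0.07, 0.12],
}
result_a = classify_k2_run(run_a)
result_b = classify_k2_run(run_b)
```

In practice, membership coefficients would come from STRUCTURE output averaged over replicate runs, and a stricter rule (for example, treating mean memberships close to 0.5 as unresolved) may be preferable.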