Šarhanová Petra, Pfanzelt Simon, Brandt Ronny, Himmelbach Axel, Blattner Frank R
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
Present address: Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria.
Ecol Evol. 2018 Oct 25;8(22):10817-10833. doi: 10.1002/ece3.4533. eCollection 2018 Nov.
Microsatellites (or simple sequence repeats, SSRs) are widely used markers in population genetics. Traditionally, genotyping has been carried out by recording fragment length, and it still is. Next-generation sequencing (NGS) now makes it easy to obtain sequence information for the loci of interest as well, which avoids the misinterpretations that can otherwise arise from size homoplasy. Here, an NGS strategy is described that allows hundreds of individuals to be genotyped simultaneously at many custom-designed SSR loci by combining multiplex PCR, barcoding, and Illumina sequencing. We created three datasets in which alleles were coded according to (a) length of the repetitive region, (b) total fragment length, and (c) sequence identity, in order to evaluate the benefit of having sequence data at hand rather than fragment length data alone. For each dataset, genetic diversity statistics, as well as Fst and Nm values, were calculated. The number of alleles per locus, as well as observed and expected heterozygosity, was highest in the sequence identity dataset because of single-nucleotide polymorphisms and insertions/deletions in the regions flanking the SSR motif. Size homoplasy was found to be very common, amounting to 44.7%-63.5% (mean over all loci) in the three study species. Thus, the information obtained by next-generation sequencing offers better resolution than traditional SSR genotyping and allows for more accurate evolutionary interpretations.
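The three allele codings can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the assumption that each read has already been split into 5' flank, repeat region, and 3' flank are all hypothetical. The homoplasy measure used here (the share of distinct sequence alleles that have the same total length as another distinct sequence allele) is one plausible definition chosen for illustration; the abstract does not state the exact formula behind the reported percentages.

```python
from collections import defaultdict


def code_allele(flank5, repeat, flank3):
    """Return the three codings for one allele given its (assumed) parts."""
    sequence = flank5 + repeat + flank3
    return {
        "repeat_length": len(repeat),      # coding (a): repetitive region only
        "fragment_length": len(sequence),  # coding (b): total fragment length
        "sequence": sequence,              # coding (c): sequence identity
    }


def count_alleles(alleles, coding):
    """Number of distinct alleles at a locus under the chosen coding."""
    return len({allele[coding] for allele in alleles})


def size_homoplasy_fraction(alleles):
    """Share of distinct sequence alleles hidden behind a shared length.

    Assumed definition (illustrative only): a sequence allele counts as
    homoplasious if another, different sequence has the same total length.
    """
    sequences_by_length = defaultdict(set)
    for allele in alleles:
        sequences_by_length[allele["fragment_length"]].add(allele["sequence"])
    distinct = sum(len(seqs) for seqs in sequences_by_length.values())
    hidden = sum(len(seqs) for seqs in sequences_by_length.values() if len(seqs) > 1)
    return hidden / distinct if distinct else 0.0


# Hypothetical alleles at one locus (toy sequences, not data from the study).
example = [
    code_allele("ACGT", "CACACA", "TTGA"),    # reference allele
    code_allele("ACGA", "CACACA", "TTGA"),    # SNP in the 5' flank, same length
    code_allele("ACGT", "CACACACA", "TTGA"),  # one additional repeat unit
    code_allele("ACGT", "CACACACA", "TGA"),   # 1-bp deletion in the 3' flank
]

for coding in ("repeat_length", "fragment_length", "sequence"):
    print(coding, count_alleles(example, coding))
print("size homoplasy fraction:", size_homoplasy_fraction(example))
```

In this toy locus the allele count increases from coding (a) to (b) to (c), because the flanking SNP and the flanking deletion are invisible to the repeat-length coding and the SNP is invisible to the fragment-length coding as well.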