Wageningen UR Plant Breeding, Wageningen University & Research Centre, P.O. Box 386, NL-6700AJ, Wageningen, the Netherlands; C.T. de Wit Graduate School for Production Ecology and Resource Conservation (PE&RC), Wageningen, the Netherlands.
Mol Ecol Resour. 2015 Jan;15(1):17-27. doi: 10.1111/1755-0998.12289. Epub 2014 Jun 28.
The first hurdle in developing microsatellite markers, cloning, has been overcome by next-generation sequencing. The second hurdle is testing to differentiate polymorphic from nonpolymorphic loci. The third hurdle, somewhat hidden, is that only polymorphic markers with a large effective number of alleles are sufficiently informative to be deployed in multiple studies. Both of these latter steps are laborious and still performed manually. We have developed a strategy in which we first screen reads from multiple genotypes for repeats that show the most length variants, and only these are subsequently developed into markers. We validated our strategy in tetraploid garden rose using Illumina paired-end transcriptome sequences of 11 roses. Of the 48 markers tested, two failed to amplify, but all others were polymorphic. Ten markers amplified more than one locus, indicating duplicated genes or gene families. Completely avoiding duplicated loci will be difficult because the ranges of the numbers of predicted alleles for highly polymorphic single-locus and multilocus markers largely overlapped. Of the remainder, half were replicate markers (i.e. multiple primer pairs for one locus), indicating the difficulty of correctly filtering short reads containing repeat sequences. We subsequently refined the approach to eliminate multiple primer sets targeting the same locus. The remaining 18 markers were all highly polymorphic, amplifying on average 11.7 alleles per marker (range = 6-20) in 11 tetraploid roses, exceeding the 8.2 alleles per marker of the 24 most polymorphic markers genotyped previously. This strategy therefore represents a major step forward in the development of highly polymorphic microsatellite markers.
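The screening idea described in the abstract (count the repeat-length variants seen across genotypes and keep only the most variable repeats) can be illustrated with a toy sketch. The abstract does not describe the actual pipeline, so everything here is an assumption made for illustration: the function names, the `FLANK` and `MIN_REPEATS` thresholds, the use of exact flanking sequence to identify a locus, and the made-up reads. A real workflow would also need to handle sequencing errors, reverse complements, and paralogous flanks.

```python
import re
from collections import defaultdict

# Hypothetical parameters, not taken from the paper.
FLANK = 8          # bases of flanking sequence used to identify a locus
MIN_REPEATS = 4    # minimum number of tandem motif copies to report

# A 2-3 bp motif followed by at least MIN_REPEATS - 1 further copies of itself.
SSR = re.compile(r"(([ACGT]{2,3})\2{%d,})" % (MIN_REPEATS - 1))


def find_repeats(read):
    """Yield (locus_key, repeat_count) for each fully flanked repeat in a read."""
    for m in SSR.finditer(read):
        motif = m.group(2)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        left = read[max(0, m.start() - FLANK):m.start()]
        right = read[m.end():m.end() + FLANK]
        if len(left) < FLANK or len(right) < FLANK:
            continue  # repeat not fully contained in the read
        yield (left, motif, right), len(m.group(1)) // len(motif)


def rank_loci(reads_by_genotype):
    """Rank loci by the number of distinct repeat lengths seen across genotypes."""
    variants = defaultdict(set)
    for reads in reads_by_genotype.values():
        for read in reads:
            for locus, count in find_repeats(read):
                variants[locus].add(count)
    return sorted(variants.items(), key=lambda kv: len(kv[1]), reverse=True)


def make_read(left, motif, copies, right):
    """Build a synthetic read: left flank + tandem repeat + right flank."""
    return left + motif * copies + right


# Made-up example data: locus 1 varies in length, locus 2 does not.
reads_by_genotype = {
    "genotype_1": [make_read("TTGACCGT", "AC", 5, "GGATCCAT"),
                   make_read("CCATGGAA", "TC", 5, "AGGTTCAA")],
    "genotype_2": [make_read("TTGACCGT", "AC", 7, "GGATCCAT"),
                   make_read("CCATGGAA", "TC", 5, "AGGTTCAA")],
    "genotype_3": [make_read("TTGACCGT", "AC", 9, "GGATCCAT"),
                   make_read("CCATGGAA", "TC", 5, "AGGTTCAA")],
}

ranked = rank_loci(reads_by_genotype)
for (left, motif, right), counts in ranked:
    print(motif, sorted(counts))
```

In this sketch the locus with the most distinct repeat counts sorts first, so only the top of the list would be carried forward to primer design. Keying loci on exact flanks is the simplest possible choice and is one plausible way replicate markers could slip through when the same repeat is reached from slightly different flanking windows.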