Department of Horticulture, University of Arkansas, Fayetteville, AR, 72701, USA.
Texas A&M AgriLife Research and Extension Center, Weslaco, TX, 78596, USA.
Sci Rep. 2021 May 11;11(1):9999. doi: 10.1038/s41598-021-89473-0.
The availability of well-assembled genome sequences and reduced sequencing costs have enabled the resequencing of many additional accessions in several crops, facilitating the rapid discovery and development of simple sequence repeat (SSR) markers. Although the genome sequence of the inbred spinach line Sp75 is available, previous efforts have yielded only a limited number of useful SSR markers. Identifying additional polymorphic SSR markers will support genetics and breeding research in spinach. This study aimed to use the available genomic resources to mine and catalog a large number of polymorphic SSR markers. A GMATA search of the six chromosome sequences of the Sp75 reference genome identified 42,155 SSR loci with repeat motifs of two to six nucleotides. Whole-genome sequences (30x) of 21 additional accessions were aligned against the chromosome sequences of the reference genome and genotyped in silico with the HipSTR program, which compares and counts the variation in repeat number at each SSR locus among the accessions. The SSR genotype data generated by HipSTR were filtered to remove monomorphic loci and loci with a high proportion of missing data, leaving a final set of 5,986 polymorphic SSR loci. These polymorphic SSR loci occurred at a density of 12.9 SSRs/Mb and were physically mapped. Of 36 SSR loci randomly selected for validation, two failed to amplify, while the remaining 34 were all polymorphic in a set of 48 spinach accessions from 34 countries. Genetic diversity analysis of the SSR allele score data from the 48 spinach accessions revealed three main population groups. This strategy of mining and developing polymorphic SSR markers through comparative analysis of the genome sequences of multiple accessions, combined with computational genotyping of the candidate SSR loci, eliminates the need for laborious experimental screening. As demonstrated in this report, the approach increased the efficiency of discovering a large set of novel polymorphic SSR markers.
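The filtering step described above (dropping monomorphic loci and loci with many missing calls) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the in-memory genotype table, the function name `filter_polymorphic`, the locus names, and the 20% missing-data threshold are all assumptions made for illustration, since the abstract does not state the cutoffs used or how the HipSTR output was parsed.

```python
from typing import Dict, List, Optional

# Hypothetical genotype table: locus ID -> repeat count per accession.
# None marks an accession with no genotype call at that locus.
Genotypes = Dict[str, List[Optional[int]]]


def filter_polymorphic(genotypes: Genotypes, max_missing: float = 0.2) -> List[str]:
    """Return loci that pass the missing-data cutoff and carry at least two alleles.

    max_missing is an assumed threshold, not a value taken from the study.
    """
    kept = []
    for locus, calls in genotypes.items():
        if not calls:
            continue  # no accessions genotyped at all
        observed = [c for c in calls if c is not None]
        missing_rate = 1 - len(observed) / len(calls)
        if missing_rate > max_missing:
            continue  # too many accessions without a call
        if len(set(observed)) < 2:
            continue  # monomorphic: every called accession has the same repeat count
        kept.append(locus)
    return kept


# Toy data for four accessions (invented values, for illustration only).
example = {
    "SSR_A": [10, 10, 12, 11],    # polymorphic, fully called
    "SSR_B": [8, 8, 8, 8],        # monomorphic
    "SSR_C": [5, None, None, 7],  # half of the calls missing
}

polymorphic_loci = filter_polymorphic(example)
print(polymorphic_loci)
```

With the toy data, only `SSR_A` survives both filters. In a real workflow the table would presumably be built from the genotype calls in the HipSTR VCF output rather than typed in by hand.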