Song Jian, Yang Xiping, Resende Marcio F R, Neves Leandro G, Todd James, Zhang Jisen, Comstock Jack C, Wang Jianping
Agronomy Department, University of Florida, Gainesville, FL, USA; College of Life Sciences, Dezhou University, Dezhou, China.
Agronomy Department, University of Florida, Gainesville, FL, USA.
Front Plant Sci. 2016 Jun 8;7:804. doi: 10.3389/fpls.2016.00804. eCollection 2016.
Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with a highly polyploid and complex genome. The Saccharum complex, comprising the genus Saccharum and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Although characterizing this allelic variation is challenging, dissecting allelic structure and identifying the alleles that control important traits in sugarcane is critical. To characterize natural variation in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and the sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variant and genotype calling was established. BWA-mem and the sorghum genome proved to be an acceptable aligner and reference, respectively, for sugarcane target enrichment sequence analysis. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared, and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified, and most loci were estimated to be largely homozygous. An average of 34,397 haplotype blocks per accession was inferred. The greatest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. The target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
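The abstract states that single dosage SNPs were identified from each accession's SNP loci and ploidy level, but it does not describe the calculation. The following is only a minimal sketch of one common way to reason about allele dosage in a polyploid, not the authors' pipeline: it assumes that the fraction of reads carrying the alternate allele approximates the fraction of chromosome copies carrying it. The function names `estimate_dosage` and `is_single_dosage`, and the read counts in the usage note, are hypothetical.

```python
def estimate_dosage(alt_reads: int, total_reads: int, ploidy: int) -> int:
    """Estimate how many of `ploidy` chromosome copies carry the alternate allele.

    Assumption (not from the source): read counts are proportional to allele
    copy number, so the dosage d whose expected fraction d / ploidy is closest
    to the observed alternate-read fraction is returned.
    """
    if ploidy <= 0:
        raise ValueError("ploidy must be positive")
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")

    observed = alt_reads / total_reads
    # Compare the observed fraction with each possible expected fraction.
    return min(range(ploidy + 1), key=lambda d: abs(d / ploidy - observed))


def is_single_dosage(alt_reads: int, total_reads: int, ploidy: int) -> bool:
    """Return True when the alternate allele is estimated to be present in one copy."""
    return estimate_dosage(alt_reads, total_reads, ploidy) == 1
```

For example, under these assumptions, a locus in an octoploid accession with 10 alternate reads out of 80 has an observed fraction of 0.125, which matches one copy in eight, so `is_single_dosage(10, 80, 8)` returns `True`. A real pipeline would also need to account for sequencing error, reference bias, and read-depth thresholds, none of which are modeled here.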