Cereal Research Centre, Agriculture and Agri-Food Canada, Winnipeg, Manitoba, Canada.
BMC Genomics. 2012 Dec 6;13:684. doi: 10.1186/1471-2164-13-684.
Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs, are of limited use in the construction of high-density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next-generation Illumina sequencing for rapid and large-scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs.
Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform, resulting in sequence coverage ranging from 4.33X to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered, with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% were shared among three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%.
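As a quick check of the arithmetic behind these figures, the following minimal sketch recomputes the validation rate from the reported counts and converts the rounded sharing percentages into approximate SNP counts. The variable names are illustrative only, and because the abstract gives the sharing percentages as rounded values, the derived counts are approximations rather than figures reported by the authors.

```python
# Counts quoted in the abstract.
validated_snps = 4706   # Macbeth SNPs confirmed by genotyping-by-sequencing
macbeth_snps = 4863     # SNPs discovered in Macbeth
total_snps = 55465      # SNPs discovered across all eight genotypes

# Validation rate as a percentage of the Macbeth SNPs.
validation_rate = 100 * validated_snps / macbeth_snps
print(f"Validation rate: {validation_rate:.1f}%")

# Rounded sharing percentages from the abstract (assumed to sum to 100%).
sharing_percent = {
    "single genotype": 84,
    "two genotypes": 13,
    "three or more genotypes": 3,
}

# Approximate SNP counts implied by the rounded percentages.
approx_counts = {
    label: round(total_snps * pct / 100)
    for label, pct in sharing_percent.items()
}
for label, count in approx_counts.items():
    print(f"{label}: ~{count} SNPs")
```

The recomputed rate agrees with the reported 96.8% once rounded to one decimal place.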
Next-generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery in flax. The genotyping-by-sequencing approach proved to be efficient for validation. The SNP resources generated in this work will assist in generating high-density maps of flax and facilitate QTL discovery, marker-assisted selection, phylogenetic analyses, association mapping and anchoring of the whole genome shotgun sequence.