Pan Jin, Wang Baosheng, Pei Zhi-Yong, Zhao Wei, Gao Jie, Mao Jian-Feng, Wang Xiao-Ru
Department of Ecology and Environmental Science, Umeå University, Umeå, SE-90187, Sweden.
State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China.
Mol Ecol Resour. 2015 Jul;15(4):711-22. doi: 10.1111/1755-0998.12342. Epub 2014 Nov 20.
Flexibility and low cost make genotyping-by-sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI-MseI digestions with different multiplexing levels, and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared the UNEAK, Stacks and GATK pipelines on the GBS data, and then developed a reference-free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery: from a PstI library, it produced 7,000 to 11,000 SNPs within each of three pine species and 14,751 SNPs among them. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking.
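As a rough illustration of how enzyme choice and fragment size selection shape library complexity, the following sketch performs an in silico digest on a sequence string. It is not the authors' procedure. The function names and the toy sequence are hypothetical, only the enzyme recognition sequences (PstI CTGCAG, HpaII CCGG, EcoRI GAATTC, MseI TTAA) are standard, and the sketch makes several simplifying assumptions: cuts are placed at the start of each recognition site rather than at the true cleavage offset, methylation sensitivity is ignored, and a double digest keeps every fragment rather than only those with one end from each enzyme.

```python
import re

# Standard recognition sequences. Exact cleavage offsets within the site
# are ignored here (simplifying assumption of this sketch).
SITES = {"PstI": "CTGCAG", "HpaII": "CCGG", "EcoRI": "GAATTC", "MseI": "TTAA"}


def cut_positions(seq, site):
    """Return the start index of every (possibly overlapping) site occurrence."""
    return [m.start() for m in re.finditer(f"(?={site})", seq)]


def fragment_lengths(seq, enzymes):
    """Lengths of the fragments obtained by cutting at all sites of `enzymes`."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, SITES[e])})
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces (e.g. when the sequence starts with a site).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(lengths, lo, hi):
    """Keep only the fragment lengths inside the window [lo, hi]."""
    return [n for n in lengths if lo <= n <= hi]


if __name__ == "__main__":
    toy = "AAAACTGCAGTTTTTTCCGGAAAA"  # hypothetical toy sequence
    for enzymes in (["PstI"], ["HpaII"], ["EcoRI", "MseI"]):
        frags = fragment_lengths(toy, enzymes)
        print("-".join(enzymes), len(frags), "fragments:", frags)
```

On a real genome assembly or set of contigs, the number of fragments that survive `size_select` gives a crude upper bound on the number of loci a library could sample; an enzyme with a longer or rarer recognition site yields fewer fragments, and so more reads per locus at a fixed sequencing effort.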