Jamann Tiffany M, Sood Shilpa, Wisser Randall J, Holland James B
Department of Crop Sciences, University of Illinois, Urbana, IL, United States of America.
Monsanto Company, 700 Chesterfield Parkway West, Chesterfield, Missouri, United States of America.
PLoS One. 2017 Jan 3;12(1):e0168910. doi: 10.1371/journal.pone.0168910. eCollection 2017.
Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next-generation sequencing is a cost-effective way to obtain sequence information for loci of interest across many individuals, and so to evaluate sequence variation at the population scale. Here we evaluate amplicon-based enrichment coupled with semiconductor sequencing on a validation set consisting of three maize inbred lines, two hybrids and 19 landrace accessions. We report the use of a multiplexed panel of 319 PCR assays that target 20 candidate loci associated with photoperiod sensitivity in maize while requiring 25 ng or less of starting DNA per sample. Enriched regions had an average on-target sequence read depth of 105, with 98% of the sequence data mapping to the maize 'B73' reference and 80% of the reads mapping to the target interval. Sequence reads were aligned to B73, and 1,486 and 1,244 variants were called using SAMtools and GATK, respectively. Of the variants called by both SAMtools and GATK, 30% were not previously reported in maize. Due to the high sequence read depth, heterozygote genotypes could be called with at least 92.5% accuracy in hybrid materials using GATK. The genetic data are congruent with previous reports of high total genetic diversity and substantial population differentiation among maize landraces. In conclusion, semiconductor sequencing of highly multiplexed PCR reactions is a cost-effective strategy for resequencing targeted genomic loci in diverse maize materials.
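The comparison of variant calls from two callers described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes each caller's output is available as tab-separated VCF-style lines, that variants are matched on chromosome, position, reference allele and alternate allele, and that a set of previously reported variants is available for the novelty check. The names `parse_vcf_lines` and `concordance` and all records below are hypothetical toy values, not data from the study.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Key used to match calls across callers (an assumption of this sketch)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect variants from VCF-style lines, one Variant per alternate allele."""
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        # VCF column order: CHROM, POS, ID, REF, ALT, ...
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):  # split multi-allelic records
            variants.add(Variant(chrom, pos, ref, alt))
    return variants


def concordance(calls_a: Set[Variant], calls_b: Set[Variant],
                known: Set[Variant]) -> Dict[str, float]:
    """Summarize the overlap of two call sets and the novel share of the overlap."""
    shared = calls_a & calls_b
    novel = shared - known  # called by both callers but not previously reported
    return {
        "shared": len(shared),
        "only_a": len(calls_a - calls_b),
        "only_b": len(calls_b - calls_a),
        "novel_fraction": len(novel) / len(shared) if shared else 0.0,
    }


# Hypothetical toy records standing in for the output of two variant callers.
caller_a_lines = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t100\t.\tA\tG",
    "1\t200\t.\tC\tT",
    "10\t300\t.\tG\tA,C",
]
caller_b_lines = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t100\t.\tA\tG",
    "10\t300\t.\tG\tA",
    "10\t400\t.\tT\tC",
]
known_variants = {Variant("1", 100, "A", "G")}

result = concordance(parse_vcf_lines(caller_a_lines),
                     parse_vcf_lines(caller_b_lines),
                     known_variants)
print(result)
```

Matching on exact position and alleles is a simplification: indels that the two callers represent differently would need normalization before comparison, which this sketch does not attempt.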