Muraya Moses M, Schmutzer Thomas, Ulpinnis Chris, Scholz Uwe, Altmann Thomas
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Stadt Seeland, Germany; Department of Plant Science, Chuka University, P.O. Box 109-60400, Chuka, Kenya.
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, D-06466, Stadt Seeland, Germany.
PLoS One. 2015 Jul 7;10(7):e0132120. doi: 10.1371/journal.pone.0132120. eCollection 2015.
A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, because of the size and complexity of the maize genome, it remains expensive to generate whole-genome sequences of sufficient coverage for divergent maize lines, even with access to next-generation sequencing (NGS) technology. Because methods that reduce genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay targeting 29 Mb of genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced on the 454 NGS platform to an average depth of coverage of 19.6-fold, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment to the B73 reference, together with de novo assembly, identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among the genomic regions targeted in this study, and that this variation was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, the annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. In summary, we demonstrated that sequencing of captured DNA is a powerful approach for variant discovery in maize genes.
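To put the reported variant counts in proportion, the short sketch below derives the share of putative SNPs in each annotated class. Only the three counts come from the abstract; the constant names and the `fraction` helper are illustrative assumptions, and the resulting percentages are simple arithmetic on those counts, not figures stated by the authors.

```python
# Counts as reported in the abstract (names are illustrative, not the authors').
TOTAL_PUTATIVE_SNPS = 383_145
NON_SYNONYMOUS = 42_685
FRAMESHIFT = 7_139


def fraction(part: int, total: int) -> float:
    """Return part/total, rejecting totals that make the ratio meaningless."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


if __name__ == "__main__":
    for label, count in (("non-synonymous", NON_SYNONYMOUS),
                         ("frameshift", FRAMESHIFT)):
        share = fraction(count, TOTAL_PUTATIVE_SNPS)
        print(f"{label}: {count} of {TOTAL_PUTATIVE_SNPS} ({share:.2%})")
```

The calculation treats both classes as subsets of the 383,145 putative SNPs, which is how the abstract phrases them.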