Cericola Fabio, Jahoor Ahmed, Orabi Jihad, Andersen Jeppe R, Janss Luc L, Jensen Just
Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark.
Department of Plant Breeding, The Swedish University of Agricultural Sciences, Uppsala, Sweden.
PLoS One. 2017 Jan 12;12(1):e0169606. doi: 10.1371/journal.pone.0169606. eCollection 2017.
Wheat breeding programs generate a large amount of variation that cannot be fully explored because phenotyping throughput is limited. Genomic prediction (GP) has been proposed as a tool that provides breeding value estimates without phenotyping all the material produced, but only a subset of it, called the training population (TP). However, all the accessions under analysis must be genotyped; optimizing TP size and genotyping strategy is therefore pivotal to implementing GP in commercial breeding schemes. Here, we explored the optimum TP size and integrated pedigree records and genome-wide association study (GWAS) results to optimize the genotyping strategy. A total of 988 advanced wheat breeding lines were genotyped with the Illumina 15K SNP wheat chip and phenotyped across several years and locations for yield, lodging, and starch content. Cross-validation using the largest possible TP size and all the SNPs available after editing (~11K) yielded predictive abilities (rGP) ranging from 0.5 to 0.6. To explore TP size, rGP was computed using progressively smaller TPs. These analyses showed that a TP of around 700 lines was enough to yield the highest observed rGP. rGP was also calculated after randomly reducing the number of SNPs, which showed that around 1K markers were enough to reach the highest observed rGP. GWAS was used to identify markers associated with the traits analyzed. GWAS-based selection of SNPs increased rGP compared with random selection, and a few hundred SNPs were sufficient to obtain the highest observed rGP. For each of these scenarios, adding pedigree information was shown to be advantageous. Our results indicate that moderate TP sizes were enough to yield high rGP and that pedigree information and GWAS results can be used to greatly optimize the genotyping strategy.
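The TP-size experiment described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated rather than taken from the 988 lines, ridge regression on markers stands in for whatever prediction model the study used (the abstract does not name it), and the function names (`simulate`, `predictive_ability`, `tp_size_curve`) and the penalty `lam` are hypothetical. The sketch holds out a validation set, fits on training sets of increasing size, and reports the correlation between predicted and observed phenotypes as rGP.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Ridge marker effects (illustrative stand-in model), via the n x n dual form."""
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means
    # beta = Xc' (Xc Xc' + lam I)^-1 (y - mu); cheap when lines << markers.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y)), y - mu)
    return mu, col_means, Xc.T @ alpha


def predictive_ability(X, y, train, test, lam=100.0):
    """Correlation between predicted and observed phenotypes in the test set."""
    mu, col_means, beta = ridge_marker_effects(X[train], y[train], lam)
    y_hat = mu + (X[test] - col_means) @ beta
    return np.corrcoef(y_hat, y[test])[0, 1]


def simulate(n_lines=400, n_snps=1000, n_causal=50, h2=0.5, seed=0):
    """Simulated genotypes (0/1/2) and phenotypes; not the study's data."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, n_snps)
    X = rng.binomial(2, freqs, size=(n_lines, n_snps)).astype(float)
    effects = np.zeros(n_snps)
    causal = rng.choice(n_snps, n_causal, replace=False)
    effects[causal] = rng.normal(size=n_causal)
    g = X @ effects
    # Scale the noise so that genetic variance / total variance is about h2.
    noise_sd = np.sqrt(g.var() * (1 - h2) / h2)
    y = g + rng.normal(scale=noise_sd, size=n_lines)
    return X, y


def tp_size_curve(X, y, tp_sizes, n_test=100, n_reps=5, seed=1):
    """Mean rGP for each TP size, over random train/validation splits."""
    rng = np.random.default_rng(seed)
    results = {}
    for size in tp_sizes:
        scores = []
        for _ in range(n_reps):
            perm = rng.permutation(len(y))
            test, train = perm[:n_test], perm[n_test:n_test + size]
            scores.append(predictive_ability(X, y, train, test))
        results[size] = float(np.mean(scores))
    return results


if __name__ == "__main__":
    X, y = simulate()
    for size, r in tp_size_curve(X, y, [50, 100, 200, 300]).items():
        print(f"TP size {size:4d}: mean rGP = {r:.2f}")
```

The marker-density experiment could be sketched the same way by passing a random column subset of `X`, for example `X[:, rng.choice(X.shape[1], 100, replace=False)]`, to `tp_size_curve`. A GWAS-based selection would replace the random column choice with markers ranked by their association with the trait.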