Müller Bárbara S F, Neves Leandro G, de Almeida Filho Janeo E, Resende Márcio F R, Muñoz Patricio R, Dos Santos Paulo E T, Filho Estefano Paludzyszyn, Kirst Matias, Grattapaglia Dario
Cell Biology Department, Molecular Biology Program, Biological Sciences Institute, University of Brasília, Campus Darcy Ribeiro, Brasília, DF, 70910-900, Brazil.
EMBRAPA Genetic Resources and Biotechnology, Estação Parque Biológico, Brasília, DF, 70770-910, Brazil.
BMC Genomics. 2017 Jul 11;18(1):524. doi: 10.1186/s12864-017-3920-2.
The advent of high-throughput genotyping technologies coupled with genomic prediction methods has established a new paradigm for integrating genomics and breeding. We carried out whole-genome prediction and contrasted it with a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n = 505) and Eucalyptus pellita (n = 732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses.
Predictive ability for diameter growth reached 0.16 in E. benthamii and 0.44 in E. pellita. Predictive abilities obtained with Genomic BLUP and with different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to those obtained with all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference in predictive ability was detected among SNP sets chosen by position (equidistant genome-wide, inside genes, linkage disequilibrium pruned, or on single chromosomes), as long as the total number of SNPs used was above ~5000. When relatedness between training and validation sets was removed, predictive abilities fell to near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita. This illustrates that, while genome-wide regression can account for large proportions of the heritability, very little or none of it is captured as significant associations by GWAS in breeding populations of the size evaluated in this study.
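To make the notion of predictive ability concrete, the following is a minimal sketch, not the pipeline used in the study. It assumes simulated genotypes and phenotypes, ridge regression on SNP effects (which gives the same predictions as Genomic BLUP when the relationship matrix is built from the same markers), and random k-fold cross-validation that does not control relatedness between training and validation sets. The function names, the simulation settings, and the choice of the shrinkage parameter `lam` are illustrative assumptions.

```python
import numpy as np


def simulate(n=300, m=1000, h2=0.5, seed=0):
    """Simulate unrelated individuals (assumption: not real Eucalyptus data)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)            # allele frequencies
    X = rng.binomial(2, freqs, size=(n, m)).astype(float)  # 0/1/2 genotypes
    beta = rng.normal(0.0, 1.0, size=m)              # many small effects
    g = X @ beta
    g = (g - g.mean()) / g.std()                     # genetic values, variance 1
    e = rng.normal(0.0, np.sqrt((1.0 - h2) / h2), size=n)
    return X, g + e


def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression on centered SNPs, solved in its n x n dual form."""
    mu = X_train.mean(axis=0)
    Zt, Zv = X_train - mu, X_test - mu
    ybar = y_train.mean()
    K = Zt @ Zt.T                                    # unscaled relationship matrix
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - ybar)
    return ybar + Zv @ (Zt.T @ alpha)


def predictive_ability(X, y, lam, k=5, seed=1):
    """Mean correlation between predicted and observed phenotypes over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cors = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        pred = ridge_predict(X[train], y[train], X[fold], lam)
        cors.append(np.corrcoef(pred, y[fold])[0, 1])
    return float(np.mean(cors))


if __name__ == "__main__":
    h2 = 0.5                                         # assumed heritability
    X, y = simulate(h2=h2)
    # Assumed shrinkage: residual variance over per-marker effect variance.
    lam = X.var(axis=0).sum() * (1.0 - h2) / h2
    print("predictive ability, all SNPs:", predictive_ability(X, y, lam))
    # Evenly spaced marker subset, mimicking a reduced SNP panel.
    sub = np.arange(0, X.shape[1], 2)
    lam_sub = X[:, sub].var(axis=0).sum() * (1.0 - h2) / h2
    print("predictive ability, half the SNPs:", predictive_ability(X[:, sub], y, lam_sub))
```

Because the simulated individuals are unrelated and markers are independent, this sketch cannot reproduce the effect of relatedness or LD reported above; it only shows how predictive ability is computed and how a reduced marker panel can be compared with the full set.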
This study provides further experimental data supporting the positive prospects of using genome-wide data to capture large proportions of trait heritability and to predict growth traits in trees with accuracies equal to or better than those attainable by phenotypic selection. Additionally, our results document the superiority of the whole-genome regression approach in accounting for large proportions of the heritability of complex traits such as growth, in contrast to the limited value of the local GWAS approach for breeding applications in forest trees.