Department of General Biology, Federal University of Viçosa, 36570-900, Viçosa, MG, Brazil.
Dow AgroSciences Seeds and Biotechnology Brazil Ltda, 38490-000, Indianópolis, MG, Brazil.
Heredity (Edinb). 2018 Apr;120(4):283-295. doi: 10.1038/s41437-017-0027-0. Epub 2017 Nov 28.
An important application of genomic selection in plant breeding is predicting untested single crosses (SCs). Most investigations on the prediction efficiency were based on tested SCs using cross-validation. The main objective was to assess the prediction efficiency by correlating the predicted and true genotypic values of untested SCs (accuracy) and measuring the efficacy of identification of the best 300 untested SCs (coincidence) using simulated data. We assumed 10,000 SNPs, 400 QTLs, two groups of 70 selected DH lines, and 4900 SCs. The heritabilities for the assessed SCs were 30, 60, and 100%. The scenarios included three sampling processes of DH lines, two sampling processes of SCs for testing, two SNP densities, DH lines from distinct and the same populations, DH lines from populations with lower LD, two genetic models, three statistical models, and three statistical approaches. We derived a model for genomic prediction based on SNP average effects of substitution and dominance deviations. The prediction accuracy is not affected by the linkage phase. The prediction of untested SCs is very efficient. The accuracies and coincidences ranged from ~0.8 and 0.5 at low heritability to 0.9 and 0.7 at high heritability, respectively. We also highlight the relevance of the overall LD and demonstrate that efficient prediction of untested SCs can be achieved for crops that show no heterotic pattern, for reduced training set size (10%), for SNP density of 1 cM, and for distinct sampling processes of DH lines based on random choice of the SCs for testing.
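The two efficiency measures named above can be illustrated with a minimal sketch. It assumes that accuracy is the Pearson correlation between predicted and true genotypic values, and that coincidence is the fraction of the k truly best SCs that also appear among the k best predicted SCs (k = 300 in the study). The function names and the six-value toy data are hypothetical, chosen only for illustration; they are not the authors' code or results.

```python
import math


def accuracy(predicted, true):
    """Pearson correlation between predicted and true genotypic values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, true))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)


def top_k(values, k):
    """Set of indices of the k largest values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:k])


def coincidence(predicted, true, k):
    """Share of the true top-k single crosses recovered in the predicted top-k."""
    return len(top_k(predicted, k) & top_k(true, k)) / k


# Hypothetical toy values for six untested single crosses (illustration only).
true_values = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
predicted_values = [5.0, 3.0, 4.0, 2.0, 0.0, 1.0]

print("accuracy:", accuracy(predicted_values, true_values))
print("coincidence (k=2):", coincidence(predicted_values, true_values, k=2))
```

In this toy case the best cross is ranked correctly but the second and third are swapped, so only one of the two truly best crosses is recovered at k = 2 even though the correlation stays high. This is consistent with the abstract reporting coincidence values lower than the corresponding accuracies.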