Unité Mixte de Recherche (UMR) de Génétique Végétale, Institut National de la Recherche Agronomique (INRA), Université Paris-Sud, Centre National de la Recherche Scientifique (CNRS), 91190 Gif-sur-Yvette, France.
Genetics. 2012 Oct;192(2):715-28. doi: 10.1534/genetics.112.141473. Epub 2012 Aug 3.
Genomic selection refers to the use of genotypic information for predicting the breeding values of selection candidates. A prediction formula is calibrated with the genotypes and phenotypes of reference individuals, which constitute the calibration set. The size and composition of this set are essential parameters affecting prediction reliabilities. The objective of this study was to maximize reliabilities by optimizing the calibration set. Different criteria, based either on diversity or on the prediction error variance (PEV) derived from the realized additive relationship matrix-best linear unbiased prediction (RA-BLUP) model, were used to select the reference individuals. For the latter, we considered the mean of the PEV of the contrasts between each selection candidate and the mean of the population (PEVmean) and the mean of the expected reliabilities of the same contrasts (CDmean). These criteria were tested with phenotypic data collected on two diversity panels of maize (Zea mays L.) genotyped with a 50k SNP array. In both panels, samples chosen based on CDmean gave higher reliabilities than random samples for various calibration set sizes. CDmean also appeared superior to PEVmean, which can be explained by the fact that it takes into account the reduction of variance due to the relatedness between individuals. Selected samples were close to optimality for a wide range of trait heritabilities, which suggests that the strategy presented here can efficiently sample subsets in panels of inbred lines. A script to optimize reference samples based on CDmean is available on request.
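To make the two PEV-based criteria concrete, the following is a minimal sketch of how PEVmean and CDmean could be computed for one candidate calibration set. It is not the authors' script. It assumes a mixed model with an intercept as the only fixed effect, a positive-definite relationship matrix A, and a variance ratio lam equal to residual variance over additive genetic variance. The function name pev_cd_mean and its arguments are hypothetical, PEV is expressed in units of the residual variance, and its normalization by the squared length of each contrast is likewise an assumption.

```python
import numpy as np


def pev_cd_mean(A, calib_idx, lam):
    """Return (PEVmean, CDmean) for the individuals outside the calibration set.

    A         : (n, n) positive-definite additive relationship matrix (assumed)
    calib_idx : indices of the phenotyped (calibration) individuals
    lam       : residual variance / additive genetic variance (assumed known)
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    calib_idx = np.asarray(calib_idx, dtype=int)
    m = calib_idx.size

    # Z maps the m phenotypic records to the n individuals.
    Z = np.zeros((m, n))
    Z[np.arange(m), calib_idx] = 1.0

    # M projects out the fixed effects (assumption: intercept only).
    M = np.eye(m) - np.full((m, m), 1.0 / m)

    # Inverse of the mixed-model coefficient matrix for the genetic values.
    C_inv = np.linalg.inv(Z.T @ M @ Z + lam * np.linalg.inv(A))

    # One contrast per selection candidate: candidate minus population mean.
    cand = np.setdiff1d(np.arange(n), calib_idx)
    T = (np.eye(n) - 1.0 / n)[:, cand]

    def quad(B):
        # Diagonal of T' B T, i.e. c' B c for every contrast c.
        return np.einsum("ij,ik,kj->j", T, B, T)

    # PEV of each contrast, in units of residual variance (normalized by c'c).
    pev = quad(C_inv) / quad(np.eye(n))
    # Expected reliability (CD) of each contrast: 1 - PEV / genetic variance.
    cd = quad(A - lam * C_inv) / quad(A)
    return pev.mean(), cd.mean()
```

Under these assumptions, a calibration set would be preferred when it gives a lower PEVmean or a higher CDmean. Because CD divides by the genetic variance of each contrast, it reflects relatedness between individuals in a way that PEV alone does not.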