Department of Animal Science, Iowa State University, Ames, IA 50011, USA.
Genet Sel Evol. 2009 Dec 31;41(1):55. doi: 10.1186/1297-9686-41-55.
Genomic prediction of breeding values involves a so-called training analysis that predicts the influence of small genomic regions by regression of observed information on marker genotypes for a given population of individuals. Available observations may take the form of individual phenotypes, repeated observations, records on close family members such as progeny, estimated breeding values (EBV) or their deregressed counterparts from genetic evaluations. The literature indicates that researchers are inconsistent both in how they use EBV or deregressed data and in how they weight some data sources to account for heterogeneous variance.
A logical approach to using information for genomic prediction is introduced, which demonstrates the appropriate weights for analyzing observations with heterogeneous variance and explains the need for and the manner in which EBV should have parent average effects removed, be deregressed and weighted.
An appropriate deregression for genomic regression analyses is EBV/r², where EBV excludes parent information and r² is the reliability of that EBV. The appropriate weights for deregressed breeding values are neither the reliability nor the prediction error variance, two alternatives that have been used in published studies, but the ratio (1 - h²)/[(c + (1 - r²)/r²)h²], where c > 0 is the fraction of genetic variance not explained by markers.
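The deregression and weight stated above can be written out as a minimal Python sketch. The function names and the input checks are illustrative assumptions, not part of the paper; the EBV passed in is assumed to have parent average effects already removed, and h2 is assumed to be the heritability of the trait.

```python
def deregress_ebv(ebv_without_parent_average, r2):
    """Deregress an EBV as EBV / r2.

    Assumes the EBV already excludes parent information and that
    r2 is the reliability of that EBV (hypothetical helper name).
    """
    if not 0.0 < r2 <= 1.0:
        raise ValueError("reliability r2 must lie in (0, 1]")
    return ebv_without_parent_average / r2


def deregressed_weight(r2, h2, c):
    """Weight (1 - h2) / [(c + (1 - r2) / r2) * h2] for a deregressed EBV.

    r2: reliability of the EBV
    h2: heritability (assumed meaning of h2 in the abstract)
    c:  fraction of genetic variance not explained by markers, c > 0
    """
    if not 0.0 < r2 <= 1.0:
        raise ValueError("reliability r2 must lie in (0, 1]")
    if not 0.0 < h2 < 1.0:
        raise ValueError("heritability h2 must lie in (0, 1)")
    if c <= 0.0:
        raise ValueError("c must be positive")
    return (1.0 - h2) / ((c + (1.0 - r2) / r2) * h2)
```

With these definitions, a more reliable EBV receives a larger weight for the same h2 and c, because the term (1 - r2)/r2 shrinks as r2 approaches 1.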
Phenotypic information on some individuals and deregressed data on others can be combined in genomic analyses using appropriate weighting.
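As a rough illustration of how weighted observations from different sources might enter one regression, the sketch below fits a weighted least-squares regression of observations on the genotype at a single marker. This is a deliberate simplification: a genomic analysis fits many markers jointly, and the function name, the single-marker setup, and the 0/1/2 genotype coding are assumptions made for illustration. The weights are taken as given, one per record, whether the record is a phenotype or a deregressed EBV.

```python
def weighted_marker_regression(genotypes, observations, weights):
    """Weighted least-squares fit of observation = intercept + slope * genotype.

    genotypes:    marker genotype codes, e.g. 0, 1 or 2 (assumed coding)
    observations: phenotypes and/or deregressed EBV, one per individual
    weights:      positive weight for each record, computed beforehand
    Returns (intercept, slope).
    """
    if not (len(genotypes) == len(observations) == len(weights)):
        raise ValueError("inputs must have the same length")
    if any(w <= 0.0 for w in weights):
        raise ValueError("weights must be positive")

    total_w = sum(weights)
    # Weighted means of the genotype codes and the observations.
    x_bar = sum(w * x for w, x in zip(weights, genotypes)) / total_w
    y_bar = sum(w * y for w, y in zip(weights, observations)) / total_w

    # Weighted sums of cross-products and squares about the weighted means.
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, genotypes, observations))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, genotypes))
    if s_xx == 0.0:
        raise ValueError("marker is monomorphic in these records")

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

Records with larger weights pull the fitted marker effect more strongly toward their own values, which is how heterogeneous variance across data sources is accommodated in this sketch.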