Department of Animal Science, Michigan State University, East Lansing, Michigan 48824-1225, USA.
Genetics. 2012 Apr;190(4):1491-501. doi: 10.1534/genetics.111.131540. Epub 2011 Nov 30.
Hierarchical mixed effects models have been demonstrated to be powerful for predicting genomic merit of livestock and plants on the basis of high-density single-nucleotide polymorphism (SNP) marker panels, and their use is being increasingly advocated for genomic predictions in human health. Two particularly popular approaches, labeled BayesA and BayesB, are based on specifying all SNP-associated effects to be independent of each other. BayesB extends BayesA by allowing a large proportion of SNP markers to be associated with null effects. We further extend these two models to specify SNP effects as being spatially correlated due to the chromosomally proximal effects of causal variants. These two models, which we dub ante-BayesA and ante-BayesB, respectively, are based on a first-order nonstationary antedependence specification between SNP effects. In a simulation study involving 20 replicate data sets, each analyzed at six different SNP marker densities with average linkage disequilibrium (LD) levels ranging from r² = 0.15 to 0.31, the antedependence methods had significantly (P < 0.01) higher accuracies than their corresponding classical counterparts at higher LD levels (r² > 0.24), with differences exceeding 3%. A cross-validation study was also conducted on the heterogeneous stock mice data resource (http://mus.well.ox.ac.uk/mouse/HS/) using 6-week body weights as the phenotype. The antedependence methods increased cross-validation prediction accuracies by up to 3.6% compared to their classical counterparts (P < 0.001). Finally, we applied our method to other benchmark data sets and demonstrated that the antedependence methods were more accurate than their classical counterparts for genomic predictions, even for individuals several generations beyond the training data.
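The abstract does not state the model equations, so the following is only a minimal sketch of what a first-order nonstationary antedependence structure between SNP effects could look like. It assumes the standard antedependence recursion, in which the effect of each marker equals a marker-specific coefficient times the effect of the preceding marker plus an independent innovation. The function name, parameter names, and default values are hypothetical, the innovations are drawn from a plain normal distribution rather than from the priors used in BayesA or BayesB, and the code only simulates effects; it performs no inference and is not the authors' implementation.

```python
import random


def simulate_ante_effects(n_snps, mu_t=0.0, sd_t=0.3, sd_delta=0.1,
                          pi_null=0.0, seed=0):
    """Simulate SNP effects along one chromosome (illustrative only).

    Assumed recursion:
        g_1 = delta_1
        g_j = t_j * g_{j-1} + delta_j   for j >= 2
    where t_j is a marker-specific (hence nonstationary) coefficient and
    delta_j is an independent innovation.
    """
    rng = random.Random(seed)
    effects = []
    prev = 0.0
    for j in range(n_snps):
        # With probability pi_null the innovation is exactly zero,
        # loosely mimicking the null-effect mixture of BayesB.
        if rng.random() < pi_null:
            delta = 0.0
        else:
            delta = rng.gauss(0.0, sd_delta)
        # The first marker has no predecessor, so its coefficient is zero.
        t = rng.gauss(mu_t, sd_t) if j > 0 else 0.0
        g = t * prev + delta
        effects.append(g)
        prev = g
    return effects


if __name__ == "__main__":
    # Ten correlated effects with half of the innovations set to zero.
    print(simulate_ante_effects(10, pi_null=0.5, seed=1))
```

In this sketch, fixing every coefficient at zero (`mu_t=0.0`, `sd_t=0.0`) makes each effect equal to its own innovation, which corresponds to the independence assumption of the classical models, while nonzero coefficients let neighbouring markers share signal.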