Department of Animal Sciences, Department of Biostatistics and Medical Informatics, and Department of Dairy Science, University of Wisconsin, Madison, Wisconsin 53706, USA.
Genetics. 2013 Jul;194(3):573-96. doi: 10.1534/genetics.113.151753. Epub 2013 May 1.
Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term "Bayesian alphabet" refers to a growing number of letters used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data, where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various ways, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters ("tuning knobs") are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a place in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p.
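The claim that parameters are not likelihood identified when p exceeds n can be illustrated with a small numerical sketch. It is not taken from the paper. It assumes the simplest conjugate setting, a Gaussian prior on the marker effects (a ridge-type regression) with known variance ratio, and all names (`posterior_mean`, `lam`, `null_proj`) and the simulated data are hypothetical. Under these assumptions, the part of the posterior mean lying in the null space of the marker matrix X is exactly the corresponding part of the prior mean, so the data never update it.

```python
import numpy as np


def posterior_mean(X, y, prior_mean, lam):
    """Posterior mean of beta for y ~ N(X beta, s2 I) and
    beta ~ N(prior_mean, (s2 / lam) I), with lam assumed known."""
    p = X.shape[1]
    A = X.T @ X + lam * np.eye(p)
    return prior_mean + np.linalg.solve(A, X.T @ (y - X @ prior_mean))


# Simulated data with far more parameters than observations (n << p).
rng = np.random.default_rng(0)
n, p = 20, 100
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + rng.standard_normal(n)

# Projector onto the null space of X: directions of beta that leave
# X @ beta, and hence the likelihood, unchanged.
null_proj = np.eye(p) - np.linalg.pinv(X) @ X

# Two analysts who differ only in their prior mean.
m0 = np.zeros(p)
m1 = np.ones(p)
b0 = posterior_mean(X, y, m0, lam=1.0)
b1 = posterior_mean(X, y, m1, lam=1.0)

# In the null space, each posterior mean coincides with its own prior mean.
gap0 = np.linalg.norm(null_proj @ (b0 - m0))
gap1 = np.linalg.norm(null_proj @ (b1 - m1))
print("null-space gap between posterior and prior means:", gap0, gap1)

# The two posterior means therefore disagree, purely because of the prior.
print("distance between the two posterior means:", np.linalg.norm(b1 - b0))
```

With n = 20 and p = 100, the null space has 80 dimensions, so most directions of the effect vector are set by the prior alone in this sketch. Non-Gaussian priors used by other members of the alphabet do not admit this closed form, but the likelihood is equally flat along those directions.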