Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, 3052, Australia.
Institute of Molecular Biosciences, University of Queensland, St. Lucia, 4072, Australia.
Genet Sel Evol. 2018 Mar 24;50(1):10. doi: 10.1186/s12711-018-0377-y.
Genomic prediction and quantitative trait loci (QTL) mapping typically analyze one trait at a time, which may ignore the possibility that a single polymorphism affects multiple traits. The aim of this study was to develop a multivariate Bayesian approach that can be used simultaneously for elucidating genetic architecture, QTL mapping, and genomic prediction. Our approach uses information from multiple traits to divide markers into 'unassociated' (no association with any trait) and 'associated' (associated with one or more traits). The effects of associated markers are estimated independently for each trait, which avoids the assumption that QTL effects follow a multivariate normal distribution.
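The two-level marker classification described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the BayesMV sampler itself: the function name `simulate_marker_effects`, the parameters `p_associated`, `p_trait` and `effect_sd`, and the use of a single normal distribution per trait are all hypothetical choices made for illustration.

```python
import random


def simulate_marker_effects(n_markers, n_traits, p_associated=0.05,
                            p_trait=0.8, effect_sd=1.0, seed=0):
    """Draw marker effects under an 'unassociated / associated' split.

    Assumption (illustrative only): an associated marker affects each trait
    with probability p_trait, and each non-zero effect is drawn independently
    from a univariate normal, so no multivariate normal is imposed across traits.
    """
    if n_traits < 1:
        raise ValueError("n_traits must be at least 1")
    if not 0.0 < p_trait <= 1.0:
        raise ValueError("p_trait must be in (0, 1]")
    rng = random.Random(seed)
    effects = []
    for _ in range(n_markers):
        # Unassociated markers have a zero effect on every trait.
        if rng.random() >= p_associated:
            effects.append([0.0] * n_traits)
            continue
        # Associated markers must affect one or more traits,
        # so redraw the trait flags until at least one is set.
        flags = [False] * n_traits
        while not any(flags):
            flags = [rng.random() < p_trait for _ in range(n_traits)]
        # Effects are drawn independently for each affected trait.
        effects.append([rng.gauss(0.0, effect_sd) if f else 0.0 for f in flags])
    return effects


effects = simulate_marker_effects(n_markers=1000, n_traits=3, p_associated=0.1)
n_associated = sum(any(e != 0.0 for e in row) for row in effects)
print(f"{n_associated} of {len(effects)} markers affect at least one trait")
```

In this toy setting every marker is either zero for all traits or non-zero for at least one, which mirrors the 'unassociated' versus 'associated' split while leaving the per-trait effects uncorrelated by construction.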
Using simulated data, our multivariate method (BayesMV) detected a larger number of true QTL (with a posterior probability > 0.9) and increased the accuracy of genomic prediction compared to an equivalent univariate method (BayesR). With real data, accuracies of genomic prediction in validation sets for milk yield traits with high-density genotypes were approximately equal to those from equivalent single-trait methods. Compared to BayesR, BayesMV tended to select a similar number of single nucleotide polymorphisms (SNPs) per trait for genomic prediction (i.e. SNPs with non-zero effects), but BayesR selected a different set of SNPs for each trait, whereas BayesMV selected a common set of SNPs across traits. Despite these two dramatically different estimates of genetic architecture (i.e. different SNPs affecting each trait vs. pleiotropic SNPs), both models indicated that 3000 to 4000 SNPs are associated with a trait. The BayesMV approach may be advantageous when the aim is to develop a low-density SNP chip that works well for a number of traits. SNPs for milk yield traits identified by BayesMV and BayesR were also found to be associated with detailed milk composition.
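The comparison of SNP sets between traits can be made concrete with a short sketch. It assumes that per-SNP posterior probabilities are available as a dictionary for each trait. The helper names `select_markers` and `jaccard_overlap` and the toy values below are hypothetical and are not results from the study. Only the 0.9 threshold is taken from the text.

```python
def select_markers(pip_by_marker, threshold=0.9):
    """Return the markers whose posterior probability exceeds the threshold."""
    return {marker for marker, pip in pip_by_marker.items() if pip > threshold}


def jaccard_overlap(set_a, set_b):
    """Share of markers common to both sets, relative to their union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Toy posterior probabilities for two traits (made-up values).
milk = {"snp1": 0.95, "snp2": 0.40, "snp3": 0.92}
fat = {"snp1": 0.97, "snp2": 0.91, "snp3": 0.10}

milk_set = select_markers(milk)   # {"snp1", "snp3"}
fat_set = select_markers(fat)     # {"snp1", "snp2"}
print(jaccard_overlap(milk_set, fat_set))  # one shared SNP out of three
```

An overlap close to 1 would correspond to a common set of SNPs across traits, while an overlap close to 0 would correspond to a different set for each trait.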
The BayesMV method simultaneously estimates the proportion of SNPs that are associated with a combination of traits. When applied to milk production traits, most of the identified SNPs were associated with all three traits (milk, fat and protein yield). BayesMV aims to exploit pleiotropic QTL and selects a small number of SNPs that could be used to predict multiple traits.