Federal Institute of Maranhão - Campus São João dos Patos, São João dos Patos, Maranhão, Brasil.
Federal Institute of the Triângulo Mineiro - Campus Uberaba, Uberaba, Minas Gerais, Brasil.
PLoS One. 2019 Oct 23;14(10):e0222699. doi: 10.1371/journal.pone.0222699. eCollection 2019.
The development of sequencing technologies has enabled the discovery of markers that are abundantly distributed over the whole genome. Knowledge of the marker locations in reference genomes provides further insight for the search for causal regions and the prediction of genomic values. The present study proposes a Bayesian functional approach that incorporates marker locations into genomic analysis, using stochastic methods to search for causal regions and predict genotypic values. Three scenarios were analyzed: an F2 population with 300 individuals and three heritability levels (0.2, 0.5, and 0.8), with 12,150 SNP markers distributed across ten linkage groups; F∞ populations with 320 individuals and three heritability levels (0.2, 0.5, and 0.8), with 10,020 SNP markers distributed across ten linkage groups; and data from Eucalyptus spp., used to measure model performance under a real linkage disequilibrium (LD) structure, with 611 individuals whose phenotypes were simulated from QTLs distributed across a panel of 36,812 SNPs with known positions. The performance of the proposed method was compared with that of other genomic selection models, namely RR-BLUP, Bayes B, and Bayesian Lasso. The Bayesian functional model showed higher or similar predictive ability compared with these classical regression methods in both simulated and real scenarios under different LD structures. In general, the Bayesian functional model also achieved higher computational efficiency, using 12 SNPs per MCMC round. The model was efficient in identifying causal regions and showed high flexibility of analysis, as it is easily adaptable to any genomic selection model.
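The evaluation setup described in the abstract (phenotypes simulated at a chosen heritability, then scored by predictive ability) can be illustrated with a minimal sketch. This is not the authors' Bayesian functional model. It uses plain ridge regression as a stand-in for RR-BLUP, with a fixed shrinkage value rather than estimated variance components. Markers are sampled independently, so there is no LD or linkage-group structure. All function names, the QTL count, the marker count, and the shrinkage heuristic are assumptions made for illustration.

```python
import numpy as np


def simulate_phenotypes(genotypes, n_qtl, h2, rng):
    """Simulate phenotypes from randomly chosen QTLs at target heritability h2."""
    n, p = genotypes.shape
    qtl = rng.choice(p, size=n_qtl, replace=False)
    effects = np.zeros(p)
    effects[qtl] = rng.normal(size=n_qtl)
    g = genotypes @ effects                      # true genotypic values
    var_e = g.var() * (1.0 - h2) / h2            # noise variance implied by h2
    y = g + rng.normal(scale=np.sqrt(var_e), size=n)
    return y, qtl


def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression in dual form, which is cheap when markers outnumber individuals."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    K = Xc @ Xc.T                                # n x n Gram matrix
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu_y)
    beta = Xc.T @ alpha                          # marker effect estimates
    return mu_y + (X_test - mu_x) @ beta


def predictive_ability(n=300, p=2000, n_qtl=20, h2=0.5, seed=0):
    """Correlation between predicted and observed phenotypes on a held-out set."""
    rng = np.random.default_rng(seed)
    # Assumption: independent markers coded 0/1/2 with allele frequency 0.5.
    X = rng.binomial(2, 0.5, size=(n, p)).astype(float)
    y, _ = simulate_phenotypes(X, n_qtl, h2, rng)
    idx = rng.permutation(n)
    n_train = int(0.8 * n)
    train, test = idx[:n_train], idx[n_train:]
    # Assumption: heuristic shrinkage, total marker variance scaled by (1 - h2) / h2.
    lam = X[train].var(axis=0).sum() * (1.0 - h2) / h2
    y_hat = ridge_predict(X[train], y[train], X[test], lam)
    return float(np.corrcoef(y_hat, y[test])[0, 1])


if __name__ == "__main__":
    for h2 in (0.2, 0.5, 0.8):
        print(f"h2={h2}: predictive ability = {predictive_ability(h2=h2):.3f}")
```

The three heritability levels in the loop mirror those in the abstract. The marker count is reduced here only to keep the sketch fast.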