Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE).

Author information

Martini Johannes W R, Gao Ning, Cardoso Diercles F, Wimmer Valentin, Erbe Malena, Cantet Rodolfo J C, Simianer Henner

Affiliations

Department of Animal Sciences, Georg-August University, Albrecht Thaer-Weg 3, Göttingen, Germany.

National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China.

Publication information

BMC Bioinformatics. 2017 Jan 3;18(1):3. doi: 10.1186/s12859-016-1439-1.

Abstract

BACKGROUND

Epistasis marker effect models that incorporate products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact has been recognized in the literature, the nature of the problem has not been thoroughly investigated so far.

RESULTS

We illustrate how the choice of marker coding implicitly specifies the model of how effects of certain allele combinations at different loci contribute to the phenotype, and investigate coding-dependent properties of EGBLUP. Moreover, we discuss an alternative categorical epistasis model (CE) eliminating undesired properties of EGBLUP and show that the CE model can improve predictive ability. Finally, we demonstrate that the coding-dependent performance of EGBLUP offers the possibility to incorporate prior experimental information into the prediction method by adapting the coding to already available phenotypic records on other traits.
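The coding dependence described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the toy genotypes, the function names, and the choice to use unscaled products over locus pairs j < k are all hypothetical, and the EGBLUP model in the paper may scale or define the interaction terms differently. The sketch builds the pairwise-product predictors for the same three individuals under a {0,1,2} coding and under the symmetric {-1,0,1} coding, then compares the resulting similarity matrices.

```python
from itertools import combinations


def pairwise_products(row):
    """Products m_j * m_k for all locus pairs j < k (the epistasis predictors)."""
    return [row[j] * row[k] for j, k in combinations(range(len(row)), 2)]


def epistasis_kernel(markers):
    """Unscaled Gram matrix of the pairwise-product predictors."""
    feats = [pairwise_products(row) for row in markers]
    return [[sum(a * b for a, b in zip(fi, fl)) for fl in feats] for fi in feats]


def recode(markers, shift):
    """Subtract a constant from every marker value (changes only the coding)."""
    return [[m - shift for m in row] for row in markers]


# Hypothetical toy data: 3 individuals x 3 loci, coded as allele counts 0/1/2.
genotypes_012 = [
    [0, 1, 2],
    [2, 1, 0],
    [2, 2, 2],
]
# The same genotypes under the symmetric coding {-1, 0, 1}.
genotypes_sym = recode(genotypes_012, 1)

k_012 = epistasis_kernel(genotypes_012)
k_sym = epistasis_kernel(genotypes_sym)

for name, kernel in (("{0,1,2}", k_012), ("{-1,0,1}", k_sym)):
    print(name)
    for row in kernel:
        print("  ", row)
```

In this toy example, individuals 1 and 2 have zero similarity under the {0,1,2} coding but a similarity equal to their own diagonal entries under {-1,0,1}, although the underlying genotypes are identical. A model that uses such a matrix as its covariance structure would therefore treat the same data differently depending on the coding.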

CONCLUSION

Based on our results, a symmetric coding {-1,1} or {-1,0,1} should be preferred for EGBLUP, whereas a standardization using allele frequencies should be avoided. Moreover, CE can be a valuable alternative since it does not possess the undesired theoretical properties of EGBLUP. However, which model performs best will depend on characteristics of the data and available prior information. Data from previous experiments can, for instance, be incorporated into the marker coding of EGBLUP.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3fd/5209948/3ce0574f6c69/12859_2016_1439_Fig1_HTML.jpg
