Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE).

Author information

Martini Johannes W R, Gao Ning, Cardoso Diercles F, Wimmer Valentin, Erbe Malena, Cantet Rodolfo J C, Simianer Henner

Affiliations

Department of Animal Sciences, Georg-August University, Albrecht Thaer-Weg 3, Göttingen, Germany.

National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China.

Publication information

BMC Bioinformatics. 2017 Jan 3;18(1):3. doi: 10.1186/s12859-016-1439-1.

DOI:10.1186/s12859-016-1439-1

PMID:28049412

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5209948/

Abstract

BACKGROUND

Epistasis marker effect models incorporating products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact has been recognized in the literature, the nature of the problem has not been thoroughly investigated so far.
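As a rough illustration of what "products of marker values as predictor variables" means, the following minimal sketch builds pairwise product predictors for a toy genotype matrix under two codings. The data, the function names `recode` and `pairwise_products`, and the restriction to locus pairs j < k are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations

def recode(genotypes, shift):
    """Subtract a constant from every marker value (e.g. 0/1/2 -> -1/0/1)."""
    return [[g - shift for g in row] for row in genotypes]

def pairwise_products(genotypes):
    """For each individual, return x_j * x_k for all locus pairs j < k."""
    n_loci = len(genotypes[0])
    pairs = list(combinations(range(n_loci), 2))
    return [[row[j] * row[k] for j, k in pairs] for row in genotypes]

# Hypothetical toy data: 3 individuals x 3 loci, coded as allele counts {0, 1, 2}.
counts = [
    [0, 1, 2],
    [2, 2, 0],
    [1, 0, 1],
]
symmetric = recode(counts, 1)  # the same genotypes in {-1, 0, 1} coding

products_counts = pairwise_products(counts)
products_symmetric = pairwise_products(symmetric)

print(products_counts)     # [[0, 0, 2], [4, 0, 0], [0, 1, 0]]
print(products_symmetric)  # [[0, -1, 0], [1, -1, -1], [0, 0, 0]]
```

The same genotypes give different interaction predictors under the two codings. In the count coding a product is nonzero only when both loci carry the counted allele, whereas in the symmetric coding every pair involving a heterozygous locus is zero.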

RESULTS

We illustrate how the choice of marker coding implicitly specifies the model of how effects of certain allele combinations at different loci contribute to the phenotype, and investigate coding-dependent properties of EGBLUP. Moreover, we discuss an alternative categorical epistasis model (CE) eliminating undesired properties of EGBLUP and show that the CE model can improve predictive ability. Finally, we demonstrate that the coding-dependent performance of EGBLUP offers the possibility to incorporate prior experimental information into the prediction method by adapting the coding to already available phenotypic records on other traits.
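One way to see why the coding implicitly specifies the model is to expand a single interaction term after shifting the coding of two markers. The notation is assumed for this illustration only: x_j and x_k are marker values, h_{jk} is the effect attached to their product, and a and b are constant shifts of the coding.

```latex
\begin{align*}
(x_j - a)(x_k - b)\,h_{jk}
  &= x_j x_k\,h_{jk} \;-\; b\,x_j\,h_{jk} \;-\; a\,x_k\,h_{jk} \;+\; ab\,h_{jk}
\end{align*}
```

Shifting the coding therefore changes how the contribution of a locus pair is split between the product term, the two additive terms, and the intercept. When the effects are shrunk or treated as random, as is usual in GBLUP-type models, the two parameterizations need not give the same predictions.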

CONCLUSION

Based on our results, for EGBLUP, a symmetric coding {-1,1} or {-1,0,1} should be preferred, whereas a standardization using allele frequencies should be avoided. Moreover, CE can be a valuable alternative since it does not possess the undesired theoretical properties of EGBLUP. However, which model performs best will depend on characteristics of the data and available prior information. Data from previous experiments can, for instance, be incorporated into the marker coding of EGBLUP.
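The abstract does not spell out the CE model. The sketch below rests on one assumed reading of "categorical": each genotype combination at a pair of loci is its own category, so two individuals are similar on a locus pair only if they carry the same combination there. The function names and toy genotypes are hypothetical, and this is not presented as the paper's exact definition. It contrasts that similarity with a similarity built from marker products.

```python
from itertools import combinations

def shared_pair_categories(ind_a, ind_b):
    """Count locus pairs (j < k) where both individuals carry the same genotype combination."""
    pairs = combinations(range(len(ind_a)), 2)
    return sum(1 for j, k in pairs
               if (ind_a[j], ind_a[k]) == (ind_b[j], ind_b[k]))

def product_similarity(ind_a, ind_b):
    """Inner product of the pairwise marker-product predictors of two individuals."""
    pairs = combinations(range(len(ind_a)), 2)
    return sum((ind_a[j] * ind_a[k]) * (ind_b[j] * ind_b[k]) for j, k in pairs)

def relabel(ind, mapping):
    """Apply a new numeric coding to the same underlying genotypes."""
    return [mapping[g] for g in ind]

# Hypothetical genotypes at 4 loci in {0, 1, 2} coding.
a = [0, 1, 2, 2]
b = [0, 1, 0, 2]
to_symmetric = {0: -1, 1: 0, 2: 1}
a_sym, b_sym = relabel(a, to_symmetric), relabel(b, to_symmetric)

categorical_counts = shared_pair_categories(a, b)
categorical_symmetric = shared_pair_categories(a_sym, b_sym)
product_counts = product_similarity(a, b)
product_symmetric = product_similarity(a_sym, b_sym)

print(categorical_counts, categorical_symmetric)  # 3 3
print(product_counts, product_symmetric)          # 4 -1
```

Under this assumed reading, the categorical similarity is unchanged by recoding because it only asks whether genotype combinations match, while the product-based similarity changes with the numeric coding.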

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3fd/5209948/3ce0574f6c69/12859_2016_1439_Fig1_HTML.jpg
