
A computationally efficient algorithm for genomic prediction using a Bayesian model.

Authors

Wang Tingting, Chen Yi-Ping Phoebe, Goddard Michael E, Meuwissen Theo H E, Kemper Kathryn E, Hayes Ben J

Affiliations

Faculty of Science, Technology and Engineering, La Trobe University, Melbourne, VIC, 3086, Australia.

Biosciences Research Division, Department of Primary Industries, Bundoora, Melbourne, VIC, 3083, Australia.

Publication information

Genet Sel Evol. 2015 Apr 30;47(1):34. doi: 10.1186/s12711-014-0082-4.

DOI: 10.1186/s12711-014-0082-4
PMID: 25926276
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4415253/
Abstract

BACKGROUND

Genomic prediction of breeding values from dense single nucleotide polymorphism (SNP) genotypes is used in livestock and crop breeding, and can also be used to predict disease risk in humans. For some traits, the most accurate genomic predictions are achieved with non-linear estimates of SNP effects from Bayesian methods that treat SNP effects as random effects drawn from a heavy-tailed prior distribution. These Bayesian methods are usually implemented via Markov chain Monte Carlo (MCMC) schemes that sample from the posterior distribution of SNP effects, which is computationally expensive. Our aim was to develop an efficient expectation-maximisation algorithm (emBayesR) that gives estimates of SNP effects and accuracies of genomic prediction similar to those of the MCMC implementation of BayesR (a Bayesian method for genomic prediction), but with greatly reduced computation time.

METHODS

emBayesR is an approximate EM algorithm that retains the BayesR model assumption that SNP effects are sampled from a mixture of normal distributions with increasing variance. emBayesR differs from other proposed non-MCMC implementations of Bayesian methods for genomic prediction in that it estimates the effect of each SNP while allowing for the error associated with the estimation of all other SNP effects. emBayesR was compared to BayesR using simulated data and real dairy cattle data with 632 003 genotyped SNPs, to determine whether the MCMC and expectation-maximisation approaches give similar accuracies of genomic prediction.
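The mixture-of-normals idea can be illustrated with a small Python sketch. This is not the emBayesR algorithm itself: it is a simplified, single-SNP E-step and M-step for a mixture prior, and it leaves out the correction for the estimation error of the other SNP effects that the abstract describes. The variance fractions in `VAR_FRACTIONS`, the function names, and the example numbers are assumptions made for illustration; the abstract only states that the component variances increase.

```python
import numpy as np

# ASSUMPTION: four mixture components whose variances are these fractions of
# the genetic variance. The abstract only says "increasing variance".
VAR_FRACTIONS = np.array([0.0, 1e-4, 1e-3, 1e-2])


def e_step_single_snp(b_hat, se2, mix_props, sigma2_g):
    """Posterior component probabilities and posterior mean for one SNP.

    b_hat     : unshrunk estimate of the SNP effect
    se2       : sampling variance of b_hat (must be > 0)
    mix_props : prior mixing proportions (all > 0, summing to 1)
    sigma2_g  : genetic variance that scales the component variances
    """
    prior_vars = VAR_FRACTIONS * sigma2_g
    # Under component k, b_hat ~ N(0, prior_var_k + se2).
    marg_vars = prior_vars + se2
    log_w = (np.log(mix_props)
             - 0.5 * np.log(2.0 * np.pi * marg_vars)
             - 0.5 * b_hat ** 2 / marg_vars)
    log_w -= log_w.max()              # stabilise before exponentiating
    w = np.exp(log_w)
    probs = w / w.sum()
    # Within component k the effect is shrunk by prior_var_k / marg_var_k.
    shrink = prior_vars / marg_vars
    post_mean = float(np.sum(probs * shrink) * b_hat)
    return probs, post_mean


def m_step_mixing_proportions(prob_matrix):
    """Update mixing proportions as the mean membership probability per component."""
    return np.asarray(prob_matrix).mean(axis=0)


if __name__ == "__main__":
    props = np.array([0.94, 0.04, 0.015, 0.005])  # hypothetical starting values
    for b_hat in (0.01, 0.5):
        probs, post_mean = e_step_single_snp(b_hat, 1e-4, props, 1.0)
        print(b_hat, probs.round(3), post_mean / b_hat)
```

With these hypothetical inputs, a small estimate is shrunk almost to zero while a large estimate is left nearly unchanged, which is the non-linear shrinkage that a heavy-tailed mixture prior produces.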

RESULTS

With both simulated and real data, allowing for the error associated with the estimation of other SNP effects when estimating the effect of each SNP improved the accuracy of genomic prediction compared with emBayesR without this error correction. When averaged over nine dairy traits, the accuracy of genomic prediction with emBayesR was only 0.5% lower than that from BayesR. However, emBayesR reduced computing time by up to 8-fold compared to BayesR.
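The comparisons reported here can be reproduced in form, though not in value, with a short Python sketch. It assumes that accuracy is measured as the Pearson correlation between predicted and observed values in a validation set, which the abstract does not state, and it reads the accuracy gap as an absolute difference averaged over traits. The function names and all numbers are hypothetical.

```python
import numpy as np


def prediction_accuracy(predicted, observed):
    """ASSUMPTION: accuracy = Pearson correlation of predictions with observations."""
    return float(np.corrcoef(predicted, observed)[0, 1])


def mean_accuracy_gap(acc_new, acc_reference):
    """Average per-trait difference in accuracy (new method minus reference)."""
    return float(np.mean(np.asarray(acc_new) - np.asarray(acc_reference)))


def fold_reduction(time_reference, time_new):
    """How many times faster the new method is than the reference."""
    return time_reference / time_new


if __name__ == "__main__":
    # Hypothetical per-trait accuracies and run times, for illustration only.
    acc_em = [0.50, 0.60]
    acc_mcmc = [0.51, 0.60]
    print(mean_accuracy_gap(acc_em, acc_mcmc))
    print(fold_reduction(time_reference=8.0, time_new=1.0))
```

Whether the reported 0.5% is an absolute or a relative difference is not stated in the abstract, so the averaging above is only one possible reading.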

CONCLUSIONS

The emBayesR algorithm described here achieved accuracies of genomic prediction similar to those of BayesR across a range of simulated data and real 630 K SNP dairy cattle data. emBayesR needs less computing time than BayesR, which will allow it to be applied to larger datasets.


Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/1763c11081a6/12711_2014_82_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/7cdbbafc82b7/12711_2014_82_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/5d3c5ad96946/12711_2014_82_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/4b7d07ead2d7/12711_2014_82_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/9abdbe4c892b/12711_2014_82_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/15f931a0cb0c/12711_2014_82_Fig6_HTML.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0025/4415253/616ef5c89231/12711_2014_82_Fig7_HTML.jpg
