Efficient methods to compute genomic predictions.

Author information

VanRaden P M

Affiliation

Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350, USA.

Publication information

J Dairy Sci. 2008 Nov;91(11):4414-23. doi: 10.3168/jds.2007-0980.

Abstract

Efficient methods for processing genomic data were developed to increase reliability of estimated breeding values and to estimate thousands of marker effects simultaneously. Algorithms were derived and computer programs tested with simulated data for 2,967 bulls and 50,000 markers distributed randomly across 30 chromosomes. Estimation of genomic inbreeding coefficients required accurate estimates of allele frequencies in the base population. Linear model predictions of breeding values were computed by 3 equivalent methods: 1) iteration for individual allele effects followed by summation across loci to obtain estimated breeding values, 2) selection index including a genomic relationship matrix, and 3) mixed model equations including the inverse of genomic relationships. A blend of first- and second-order Jacobi iteration using 2 separate relaxation factors converged well for allele frequencies and effects. Reliability of predicted net merit for young bulls was 63% compared with 32% using the traditional relationship matrix. Nonlinear predictions were also computed using iteration on data and nonlinear regression on marker deviations; an additional (about 3%) gain in reliability for young bulls increased average reliability to 66%. Computing times increased linearly with number of genotypes. Estimation of allele frequencies required 2 processor days, and genomic predictions required <1 d per trait, and traits were processed in parallel. Information from genotyping was equivalent to about 20 daughters with phenotypic records. Actual gains may differ because the simulation did not account for linkage disequilibrium in the base population or selection in subsequent generations.
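The dependence of genomic inbreeding on base-population allele frequencies can be illustrated with a small sketch. The abstract does not give the formula for the genomic relationship matrix, so the code below assumes one commonly used construction: gene counts coded 0/1/2 are centered by twice the allele frequency, and the cross-products are scaled by 2Σp(1−p). The function names, the toy genotypes, and the frequency values are hypothetical and serve only as an illustration, not as the paper's program.

```python
def genomic_relationship(genotypes, freqs):
    """Build a genomic relationship matrix (assumed construction, see lead-in).

    genotypes: list of animals, each a list of gene counts coded 0, 1, or 2.
    freqs: assumed base-population frequency of the counted allele per marker.
    """
    # Scaling factor: 2 * sum over markers of p * (1 - p).
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    # Center each gene count by its expectation, 2p.
    centered = [[g - 2.0 * p for g, p in zip(row, freqs)] for row in genotypes]
    n = len(centered)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / scale
         for j in range(n)]
        for i in range(n)
    ]


def genomic_inbreeding(G):
    """Genomic inbreeding taken as the diagonal of G minus 1."""
    return [G[i][i] - 1.0 for i in range(len(G))]


# Toy data: 3 animals, 4 markers (hypothetical values).
genotypes = [
    [0, 1, 2, 1],
    [2, 1, 0, 1],
    [1, 1, 1, 1],
]

# Same genotypes, two different assumptions about base allele frequencies.
freqs_a = [0.5, 0.5, 0.5, 0.5]
freqs_b = [0.2, 0.5, 0.8, 0.5]

G = genomic_relationship(genotypes, freqs_a)
F = genomic_inbreeding(G)
F_alt = genomic_inbreeding(genomic_relationship(genotypes, freqs_b))

print("Inbreeding with freqs_a:", F)
print("Inbreeding with freqs_b:", F_alt)
```

Running the sketch with the two frequency vectors gives different inbreeding coefficients for the same genotypes, which is consistent with the abstract's point that accurate base-population allele frequencies are needed.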

