Technical note: Computing strategies in genome-wide selection.

Authors

Legarra A, Misztal I

Affiliation

Institut National de la Recherche Agronomique, UR631 Station d'Amélioration Génétique des Animaux, BP 52627, 32326 Castanet-Tolosan, France.

Publication

J Dairy Sci. 2008 Jan;91(1):360-6. doi: 10.3168/jds.2007-0403.

Abstract

Genome-wide genetic evaluation might involve the computation of BLUP-like estimations, potentially including thousands of covariates (i.e., single-nucleotide polymorphism markers) for each record. This implies dense Henderson's mixed-model equations and considerable computing resources in time and storage, even for a few thousand records. Possible computing options include the type of storage and the solving algorithm. This work evaluated several computing options, including half-stored Cholesky decomposition, Gauss-Seidel, and 3 matrix-free strategies: Gauss-Seidel, Gauss-Seidel with residuals update, and preconditioned conjugate gradients. Matrix-free Gauss-Seidel with residuals update adjusts the residuals after computing the solution for each effect. This avoids adjusting the left-hand side of the equations by all other effects at every step of the algorithm and saves considerable computing time. Any Gauss-Seidel algorithm can easily be extended for variance component estimation by Markov chain-Monte Carlo. Let m and n be the number of records and markers, respectively. Computing time for Cholesky decomposition is proportional to n³. Computing times per round are proportional to mn² in matrix-free Gauss-Seidel, to n² for half-stored Gauss-Seidel, and to n and m for the rest of the algorithms. Algorithms were tested on a real mouse data set, which included 1,928 records and 10,946 single-nucleotide polymorphism markers. Computing times were in the order of a few minutes for Gauss-Seidel with residuals update and preconditioned conjugate gradients, more than 1 h for half-stored Gauss-Seidel, 2 h for Cholesky decomposition, and 4 d for matrix-free Gauss-Seidel. Preconditioned conjugate gradients was the fastest. Gauss-Seidel with residuals update would be the method of choice for variance component estimation as well as solving.
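
The residuals-update idea described in the abstract can be illustrated with a short sketch. This is not the authors' code. It assumes a simple model y = 1·mu + Z·a + e with one overall mean, independent marker effects, and a known variance ratio lam (residual variance divided by marker-effect variance); none of these details are given in the abstract, and the names gsru, Z, y, and lam are hypothetical. The point it shows is that each effect is solved from the current residual vector, and the residuals are corrected immediately afterward, so the coefficient matrix is never built.

```python
import numpy as np

def gsru(Z, y, lam, n_rounds=200, tol=1e-12):
    """Sketch of matrix-free Gauss-Seidel with residuals update.

    Assumed model (not stated in the abstract): y = 1*mu + Z a + e,
    with lam = var(e) / var(a) treated as known.
    """
    m, n = Z.shape
    mu = 0.0
    a = np.zeros(n)
    e = np.asarray(y, dtype=float).copy()  # residuals for the all-zero start
    zz = np.einsum("ij,ij->j", Z, Z)       # z_i'z_i for each marker, computed once

    for _ in range(n_rounds):
        change = 0.0

        # Overall mean: a fixed effect, so no shrinkage term.
        mu_new = mu + e.sum() / m
        e -= mu_new - mu                   # update residuals right away
        change += (mu_new - mu) ** 2
        mu = mu_new

        # Marker effects, one at a time.
        for i in range(n):
            z = Z[:, i]
            rhs = z @ e + zz[i] * a[i]     # add back this marker's own contribution
            a_new = rhs / (zz[i] + lam)
            e -= z * (a_new - a[i])        # update residuals right away
            change += (a_new - a[i]) ** 2
            a[i] = a_new

        # Stop when the relative squared change in the solution is small.
        if change < tol * (mu * mu + a @ a + 1e-30):
            break
    return mu, a
```

In this sketch each marker update reads one column of Z and the residual vector, both of length m, so one full round costs on the order of m times n operations and needs storage only for Z, the residuals, and the solutions.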

