Journal of Dairy Science

Technical Note: Computing Strategies in Genome-Wide Selection


Abstract

Genome-wide genetic evaluation might involve the computation of BLUP-like estimations, potentially including thousands of covariates (i.e., single-nucleotide polymorphism markers) for each record. This implies dense Henderson's mixed-model equations and considerable computing resources in time and storage, even for a few thousand records. Possible computing options include the type of storage and the solving algorithm. This work evaluated several computing options, including half-stored Cholesky decomposition, half-stored Gauss-Seidel, and 3 matrix-free strategies: Gauss-Seidel, Gauss-Seidel with residuals update, and preconditioned conjugate gradients. Matrix-free Gauss-Seidel with residuals update adjusts the residuals after computing the solution for each effect. This avoids adjusting the left-hand side of the equations by all other effects at every step of the algorithm and saves considerable computing time. Any Gauss-Seidel algorithm can easily be extended for variance component estimation by Markov chain Monte Carlo. Let m and n be the number of records and markers, respectively. Computing time for Cholesky decomposition is proportional to n^3. Computing times per round are proportional to mn^2 in matrix-free Gauss-Seidel, to n^2 for half-stored Gauss-Seidel, and to n and m for the rest of the algorithms. Algorithms were tested on a real mouse data set, which included 1,928 records and 10,946 single-nucleotide polymorphism markers. Computing times were on the order of a few minutes for Gauss-Seidel with residuals update and preconditioned conjugate gradients, more than 1 h for half-stored Gauss-Seidel, 2 h for Cholesky decomposition, and 4 d for matrix-free Gauss-Seidel. Preconditioned conjugate gradients was the fastest. Gauss-Seidel with residuals update would be the method of choice for variance component estimation as well as solving.
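The residual-update idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simplified model with marker effects only (no mean or other fixed effects), a known variance ratio `lam` (residual variance over marker variance), and a small dense genotype matrix held as Python lists. The names `gsru_solve`, `Z`, `y`, and `lam` are hypothetical and chosen for illustration. The system solved is (Z'Z + lam I) a = Z'y, and the matrix Z'Z is never formed: each effect is updated from the current residuals, which are then corrected for the change in that effect.

```python
def gsru_solve(Z, y, lam, rounds=200, tol=1e-12):
    """Gauss-Seidel with residual update for (Z'Z + lam*I) a = Z'y.

    Z   : list of m rows, each a list of n marker covariates (assumed layout)
    y   : list of m records
    lam : assumed known ratio of residual to marker variance
    """
    m, n = len(Z), len(Z[0])
    # Store Z by columns so each marker's covariates are contiguous.
    cols = [[Z[r][j] for r in range(m)] for j in range(n)]
    # Diagonal of Z'Z, computed once.
    zz = [sum(v * v for v in col) for col in cols]
    a = [0.0] * n
    e = [float(v) for v in y]  # residuals y - Z a, with a = 0 at the start
    for _ in range(rounds):
        max_change = 0.0
        for j in range(n):
            col = cols[j]
            # z_j'e + z_j'z_j * a_j equals z_j'(y - Z a + z_j a_j):
            # the record vector corrected for all other effects.
            rhs = sum(z * r for z, r in zip(col, e)) + zz[j] * a[j]
            new = rhs / (zz[j] + lam)
            delta = new - a[j]
            # Residual update: O(m) work per effect, so O(mn) per round.
            for r in range(m):
                e[r] -= col[r] * delta
            a[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return a, e


# Toy data, made up for illustration only (4 records, 3 markers).
Z = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
y = [1.0, 2.0, 0.5, 1.5]
lam = 1.0
a, e = gsru_solve(Z, y, lam)
```

Each round touches every element of Z once when forming the right-hand sides and once when updating the residuals, which is consistent with the per-round cost proportional to m and n stated in the abstract.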
