Journal: BMC Bioinformatics

Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

Abstract

Background: The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However, a full Bayesian analysis is often not feasible due to the large computational time involved.

Results: This article proposes an expectation-maximization (EM) algorithm called emBayesB, which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP, along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis, but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described.

Conclusions: emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods, but it takes only a fraction of the time.