Published in: BMC Bioinformatics

imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters

Abstract

BACKGROUND: The imputation of genotypes increases the power of genome-wide association studies. However, imputation quality should be assessed in each particular case. Not all imputation software reports the error of its output; for example, the latest release of the fastPHASE program (1.4.8) lacks such an option. With fastPHASE there is also uncertainty in choosing the model parameters. The program is based on haplotype clusters, whose number has to be set a priori. This parameter influences the results of imputation and of downstream analysis.

RESULTS: We present imputeqc, a software toolkit for assessing imputation quality and/or choosing the model parameters for imputation. We demonstrate the efficacy of the toolkit by evaluating imputations made with both fastPHASE and BEAGLE on HapMap and 1000 Genomes data. The genotype discordance obtained with the two programs correlated well. Using imputeqc, we also show how to choose the optimal number of haplotype clusters and expectation-maximization cycles for fastPHASE. The number of haplotype clusters found, 25, was then used for hapFLK testing, which revealed signatures of selection at the LCT region on chromosome 2. We also demonstrate how to reduce the computational time of hapFLK testing from 3 days to 20 h.

CONCLUSIONS: The toolkit is implemented as an R package, imputeqc, and command-line scripts. The code is freely available at https://github.com/inzilico/imputeqc under the MIT license.
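The abstract refers to genotype discordance and to choosing the number of haplotype clusters without defining either procedure, so the following is only a minimal sketch of one plausible reading. It assumes that some known genotypes are hidden before imputation and then compared with the imputed values, and that the preferred cluster count is the one with the lowest discordance. The function names `discordance` and `pick_best_k`, the genotype encoding, and all values are hypothetical illustrations in Python; they are not the imputeqc API and not data from the paper.

```python
from typing import Dict, Sequence


def discordance(
    true_genotypes: Sequence[str],
    imputed_genotypes: Sequence[str],
    masked_positions: Sequence[int],
) -> float:
    """Fraction of masked genotypes whose imputed value differs from the original.

    Assumption: only positions that were hidden before imputation are compared.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype sequences must have the same length")
    if not masked_positions:
        raise ValueError("no masked positions to compare")
    mismatches = sum(
        1 for i in masked_positions if true_genotypes[i] != imputed_genotypes[i]
    )
    return mismatches / len(masked_positions)


def pick_best_k(discordance_by_k: Dict[int, float]) -> int:
    """Return the cluster count with the lowest discordance.

    Assumption: ties are broken in favour of the smaller (cheaper) cluster count.
    """
    if not discordance_by_k:
        raise ValueError("no candidate cluster counts given")
    return min(discordance_by_k, key=lambda k: (discordance_by_k[k], k))


if __name__ == "__main__":
    # Toy values for illustration only.
    true = ["0/0", "0/1", "1/1", "0/1", "0/0"]
    imputed = ["0/0", "1/1", "1/1", "0/1", "0/1"]
    masked = [1, 2, 4]
    print(discordance(true, imputed, masked))
    print(pick_best_k({5: 0.5, 10: 0.25, 15: 0.25}))
```

In practice the comparison would be repeated over several independently masked copies of the data and averaged, so that a single unlucky mask does not drive the choice of parameters.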