Journal: The American Journal of Human Genetics

A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals.


Abstract

We present methods for imputing data for ungenotyped markers and for inferring haplotype phase in large data sets of unrelated individuals and parent-offspring trios. Our methods make use of known haplotype phase when it is available, and our methods are computationally efficient so that the full information in large reference panels with thousands of individuals is utilized. We demonstrate that substantial gains in imputation accuracy accrue with increasingly large reference panel sizes, particularly when imputing low-frequency variants, and that unphased reference panels can provide highly accurate genotype imputation. We place our methodology in a unified framework that enables the simultaneous use of unphased and phased data from trios and unrelated individuals in a single analysis. For unrelated individuals, our imputation methods produce well-calibrated posterior genotype probabilities and highly accurate allele-frequency estimates. For trios, our haplotype-inference method is four orders of magnitude faster than the gold-standard PHASE program and has excellent accuracy. Our methods enable genotype imputation to be performed with unphased trio or unrelated reference panels, thus accounting for haplotype-phase uncertainty in the reference panel. We present a useful measure of imputation accuracy, allelic R², and show that this measure can be estimated accurately from posterior genotype probabilities. Our methods are implemented in version 3.0 of the BEAGLE software package.
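
The abstract does not state the formula behind allelic R², so the following is only an illustrative sketch and not the BEAGLE implementation. It assumes that allelic R² is the squared correlation between the true allele dosage (0, 1, or 2 copies of an allele) and the imputed expected dosage, and that the posterior genotype probabilities are well calibrated. Under that calibration assumption, the covariance between true and expected dosage equals the variance of the expected dosage, which gives an estimate that needs no true genotypes. The function names and the `(P(AA), P(AB), P(BB))` tuple layout are hypothetical choices made for this example.

```python
from statistics import fmean, pvariance


def expected_dosage(probs):
    """Expected number of B alleles given (P(AA), P(AB), P(BB))."""
    _, p_ab, p_bb = probs
    return p_ab + 2.0 * p_bb


def posterior_dosage_variance(probs):
    """Variance of the B-allele dosage under one posterior distribution."""
    _, p_ab, p_bb = probs
    mean = p_ab + 2.0 * p_bb
    return (p_ab + 4.0 * p_bb) - mean * mean


def true_allelic_r2(true_dosages, posteriors):
    """Squared Pearson correlation between true and expected dosage.

    Needs known genotypes, e.g. from markers that were masked on purpose.
    """
    x = [float(v) for v in true_dosages]
    d = [expected_dosage(p) for p in posteriors]
    mean_x, mean_d = fmean(x), fmean(d)
    cov = fmean([(a - mean_x) * (b - mean_d) for a, b in zip(x, d)])
    var_x, var_d = pvariance(x), pvariance(d)
    if var_x == 0 or var_d == 0:
        return 0.0  # correlation is undefined; report no information
    return cov * cov / (var_x * var_d)


def estimated_allelic_r2(posteriors):
    """Estimate R^2 from posterior probabilities alone.

    Assumption (not taken from the paper): with calibrated posteriors,
    Var(true dosage) = Var(expected dosage) + mean posterior variance,
    so R^2 is the share of that total explained by the expected dosage.
    """
    d = [expected_dosage(p) for p in posteriors]
    var_d = pvariance(d)
    mean_post_var = fmean([posterior_dosage_variance(p) for p in posteriors])
    total = var_d + mean_post_var
    return var_d / total if total > 0 else 0.0
```

With confident posteriors the estimate approaches 1, and when every individual receives the same posterior distribution it drops to 0, because the imputed dosages then carry no information about differences between individuals.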
