Journal: Bioinformatics

HaploPool: improving haplotype frequency estimation through DNA pools and phylogenetic modeling



Abstract

Motivation: The search for genetic variants linked to complex diseases such as cancer, Parkinson's disease, or Alzheimer's disease may lead to better treatments. Since haplotypes can serve as proxies for hidden variants, one way to find the linked variants is to look for case-control associations between haplotypes and disease. Finding these associations requires high-quality estimates of the haplotype frequencies in the population. To this end, we present HaploPool, a method for estimating haplotype frequencies from blocks of consecutive SNPs.

Results: HaploPool leverages the efficiency of DNA pools and estimates population haplotype frequencies from disjoint pools, each containing two or three unrelated individuals. We study the trade-off between pooling efficiency and the accuracy of haplotype frequency estimates. For a fixed genotyping budget, HaploPool performs favorably on pools of two individuals compared with PHASE, a state-of-the-art non-pooled phasing method. Of independent interest, HaploPool can also be used to phase non-pooled genotype data with an accuracy approaching that of PHASE. We compared our algorithm with three programs that estimate haplotype frequencies from pooled data. HaploPool is an order of magnitude more efficient (at least six times faster) and considerably more accurate than these previous methods. In contrast to previous methods, HaploPool performs well with missing data, genotyping errors, and long haplotype blocks (between 5 and 25 SNPs).
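To make the pooling setup concrete, the sketch below illustrates why a pooled genotype is ambiguous. It is not HaploPool's algorithm, which the abstract does not describe in detail. It assumes biallelic SNPs coded 0/1 and an error-free pooled genotype that reports, per SNP, the number of 1-alleles among the 2k haplotypes of a pool of k diploid individuals. The function names `pooled_genotype` and `compatible_configurations` are hypothetical.

```python
from itertools import combinations_with_replacement, product


def pooled_genotype(haplotypes):
    """Per-SNP count of the 1-allele, summed over all haplotypes in a pool."""
    return tuple(sum(column) for column in zip(*haplotypes))


def compatible_configurations(genotype, pool_size):
    """List the unordered haplotype multisets that could produce `genotype`.

    Assumption: a pool of `pool_size` diploid individuals contributes
    2 * pool_size haplotypes, and the genotype is observed without error.
    Brute force, so only practical for very short SNP blocks.
    """
    n_snps = len(genotype)
    all_haplotypes = list(product((0, 1), repeat=n_snps))
    return [
        combo
        for combo in combinations_with_replacement(all_haplotypes, 2 * pool_size)
        if pooled_genotype(combo) == genotype
    ]


if __name__ == "__main__":
    # One individual, heterozygous at both SNPs: the classic phasing ambiguity.
    print(len(compatible_configurations((1, 1), pool_size=1)))
    # Two pooled individuals with two 1-alleles at each SNP.
    print(len(compatible_configurations((2, 2), pool_size=2)))
```

The number of compatible configurations grows with both pool size and block length, which is one way to picture the trade-off between pooling efficiency and estimation accuracy that the abstract mentions.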
