Journal of Theoretical Biology

Maximum likelihood model based on minor allele frequencies and weighted Max-SAT formulation for haplotype assembly



Abstract

Human haplotypes carry essential information about SNPs, which in turn supports studies that relate diseases to their potential genetic causes, such as Genome-Wide Association Studies. Because determining haplotypes directly is expensive, and because high-throughput sequencing has progressed rapidly, there is growing interest in haplotype assembly: the problem of reconstructing a pair of haplotypes from a set of aligned fragments. Although the problem has been studied extensively and a number of algorithms have been proposed, more accurate methods remain valuable given the importance of haplotype information. In this paper, we first develop a probabilistic model that incorporates the Minor Allele Frequency (MAF) of SNP sites, which existing maximum likelihood models do not use. We then show that the probabilistic model reduces to the Minimum Error Correction (MEC) model when the MAF information is omitted and some approximations are made. This result provides novel theoretical support for the MEC, despite some criticism of it in the recent literature. Next, under the same approximations, we simplify the model to an extension of the MEC that uses the MAF information. Finally, we extend the haplotype assembly algorithm HapSAT by developing a weighted Max-SAT formulation for the simplified model, which is evaluated empirically with positive results. (C) 2014 Elsevier Ltd. All rights reserved.
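To make the MEC objective mentioned in the abstract concrete, here is a minimal sketch of the standard MEC cost for a candidate haplotype pair. It is not the paper's probabilistic model, its MAF-weighted extension, or the HapSAT encoding, none of which are specified in the abstract. The function names, the string encoding of fragments ("0"/"1" for alleles, "-" for uncovered sites), and the toy data are all assumptions made for illustration.

```python
from typing import List


def mismatches(fragment: str, haplotype: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype.

    Sites marked "-" are not covered by the fragment and are skipped.
    """
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(fragments: List[str], h1: str, h2: str) -> int:
    """Standard MEC cost: each fragment is charged its distance to the
    closer of the two haplotypes, and the charges are summed."""
    return sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)


# Hypothetical toy instance: five aligned fragments over four SNP sites.
fragments = ["01-0", "0110", "10-1", "1001", "-011"]
h1, h2 = "0110", "1001"

print(mec_score(fragments, h1, h2))  # → 1
```

In this toy instance the first four fragments match one haplotype exactly, and the last fragment needs a single correction to match `h2`, so the total cost is 1. An assembler would search for the haplotype pair that minimizes this cost.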
