Journal: Journal of Computational Biology: A Journal of Computational Molecular Cell Biology

Inference of Haplotypes from Samples of Diploid Populations: Complexity and Algorithms



Abstract

The next phase of human genomics will involve large-scale screens of populations for significant DNA polymorphisms, notably single nucleotide polymorphisms (SNPs). Dense human SNP maps are currently under construction. However, the utility of those maps and screens will be limited by the fact that humans are diploid and it is presently difficult to get separate data on the two "copies." Hence, genotype (blended) SNP data will be collected, and the desired haplotype (partitioned) data must then be (partially) inferred. A particular nondeterministic inference algorithm was proposed and studied by Clark (1990) and extensively used by Clark et al. (1998). In this paper, we more closely examine that inference method and the question of whether we can obtain an efficient, deterministic variant to optimize the obtained inferences. We show that the problem is NP-hard and, in fact, Max-SNP complete; that the reduction creates problem instances conforming to a severe restriction believed to hold in real data (Clark, 1990); and that even if we first use a natural exponential-time operation, the remaining optimization problem is NP-hard. However, we also develop, implement, and test an approach based on that operation and (integer) linear programming. The approach works quickly and correctly on simulated data.
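
To make the genotype-to-haplotype step concrete, here is a minimal sketch of the kind of inference rule commonly attributed to Clark (1990). The abstract does not specify it, so the details below are assumptions: each site is encoded as '0' or '1' when homozygous and '2' when heterozygous, the function names `resolve` and `clark_inference` are hypothetical, and known haplotypes are tried in sorted order to make the run repeatable (the method discussed in the abstract is nondeterministic, and a different order can give a different result). It is not the integer linear programming approach the paper develops.

```python
from typing import Optional


def resolve(genotype: str, haplotype: str) -> Optional[str]:
    """Return the complementary haplotype if `haplotype` could be one of the
    two copies behind `genotype`; return None if they are incompatible.

    Assumed encoding: '0' or '1' = homozygous site, '2' = heterozygous site.
    """
    if len(genotype) != len(haplotype):
        return None
    complement = []
    for g, h in zip(genotype, haplotype):
        if g == "2":
            # Heterozygous site: the other copy carries the opposite allele.
            complement.append("1" if h == "0" else "0")
        elif g == h:
            # Homozygous site: both copies carry the same allele.
            complement.append(h)
        else:
            return None
    return "".join(complement)


def clark_inference(genotypes: list[str]) -> tuple[set[str], list[str]]:
    """Greedy inference: seed with unambiguous genotypes, then repeatedly
    resolve an ambiguous genotype against an already-known haplotype."""
    known: set[str] = set()
    unresolved: list[str] = []
    for g in genotypes:
        n_het = g.count("2")
        if n_het == 0:
            known.add(g)  # both copies are identical
        elif n_het == 1:
            known.add(g.replace("2", "0"))  # the two copies differ at one site
            known.add(g.replace("2", "1"))
        else:
            unresolved.append(g)  # more than one possible pair of haplotypes

    progress = True
    while progress and unresolved:
        progress = False
        for g in list(unresolved):
            for h in sorted(known):  # fixed order; an assumption of this sketch
                c = resolve(g, h)
                if c is not None:
                    known.add(c)  # the inferred partner becomes known
                    unresolved.remove(g)
                    progress = True
                    break
    return known, unresolved


if __name__ == "__main__":
    haplotypes, orphans = clark_inference(["0101", "2221", "2022"])
    print(sorted(haplotypes), orphans)
```

Genotypes left in the second return value are the ones the rule could not resolve. The optimization question raised in the abstract is how to choose the order of resolutions so that as few genotypes as possible are left over.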
