...
Journal: BMC Bioinformatics

A double classification tree search algorithm for index SNP selection


Abstract

Background: In population-based studies, it is generally recognized that single nucleotide polymorphism (SNP) markers are not independent. Rather, they are carried by haplotypes, groups of SNPs that tend to be coinherited. It is thus possible to choose a much smaller number of SNPs to use as indices for identifying haplotypes or haplotype blocks in genetic association studies. We refer to these characteristic SNPs as index SNPs. In order to reduce costs and work, a minimum number of index SNPs that can distinguish all SNP and haplotype patterns should be chosen. Unfortunately, this is an NP-complete problem, requiring brute force algorithms that are not feasible for large data sets.

Results: We have developed a double classification tree search algorithm to generate index SNPs that can distinguish all SNP and haplotype patterns. This algorithm runs very rapidly and generates very good, though not necessarily minimum, sets of index SNPs, as is to be expected for such NP-complete problems.

Conclusions: A new algorithm for index SNP selection has been developed. A webserver for index SNP selection is available at http://cognia.cu-genome.org/cgi-bin/genome/snpIndex.cgi/
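To make the selection problem concrete, the following is a minimal sketch of a generic greedy heuristic, not the double classification tree search algorithm described in the abstract, whose details are not given here. It assumes haplotypes are encoded as equal-length strings of allele characters (one character per SNP) and only covers distinguishing haplotype patterns. The function names `greedy_index_snps` and `distinguishes_all` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def greedy_index_snps(haplotypes):
    """Greedily pick SNP positions that tell all distinct haplotypes apart.

    Illustrative heuristic only (an assumption, not the paper's method):
    at each step, choose the SNP that separates the most haplotype pairs
    that are still indistinguishable.
    """
    n_snps = len(haplotypes[0])
    if any(len(h) != n_snps for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    unique = sorted(set(haplotypes))
    # Every pair of distinct haplotypes must differ at some chosen SNP.
    unresolved = set(combinations(range(len(unique)), 2))
    chosen = []

    while unresolved:
        def gain(j):
            # Number of unresolved pairs that SNP j would separate.
            return sum(1 for a, b in unresolved if unique[a][j] != unique[b][j])

        # Distinct haplotypes differ somewhere, so the best gain is >= 1.
        best = max(range(n_snps), key=gain)
        unresolved = {
            (a, b) for a, b in unresolved if unique[a][best] == unique[b][best]
        }
        chosen.append(best)

    return sorted(chosen)


def distinguishes_all(haplotypes, snps):
    """Check that the chosen SNPs give every distinct haplotype a unique pattern."""
    unique = set(haplotypes)
    projected = {tuple(h[j] for j in snps) for h in unique}
    return len(projected) == len(unique)


if __name__ == "__main__":
    # Toy data: SNPs 0 and 1 are perfectly correlated, as are SNPs 2 and 3.
    haps = ["0000", "0011", "1100", "1111"]
    index = greedy_index_snps(haps)
    print(index, distinguishes_all(haps, index))
```

In the toy data, two of the four SNPs are redundant, so two index SNPs suffice. Like the algorithm in the abstract, a greedy choice of this kind is fast but is not guaranteed to return a minimum set.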