Journal: Bioinformatics
Genome-wide selection of tag SNPs using multiple-marker correlation


Abstract

MOTIVATIONS: The tag SNP approach is a valuable tool in whole-genome association studies, and a variety of algorithms have been proposed to identify the optimal tag SNP set. Currently, most tag SNP selection is based on two-marker (pairwise) linkage disequilibrium (LD). Recent literature has shown that multiple-marker LD also contains useful information that can further increase the genetic coverage of the tag SNP set. Thus, tag SNP selection methods that incorporate multiple-marker LD are expected to have advantages in terms of genetic coverage and statistical power.

RESULTS: We propose a novel algorithm that selects tag SNPs iteratively. In each iteration, the SNP that captures the most neighboring SNPs (through pairwise and multiple-marker LD) is selected as a tag SNP. We optimize the algorithm and computer program to make our approach feasible on today's typical workstations. Benchmarked using HapMap release 21, our algorithm outperforms the standard pairwise LD approach in several respects. (i) It improves genetic coverage (e.g. by 7.2% for 200 K tag SNPs in HapMap CEU) compared to its conventional pairwise counterpart, when conditioning on a fixed number of tag SNPs. (ii) It saves genotyping costs substantially when conditioning on fixed genetic coverage (e.g. 34.1% saving in HapMap CEU at 90% coverage). (iii) Tag SNPs identified using multiple-marker LD have good portability across closely related ethnic groups and (iv) show higher statistical power in association tests than those selected using conventional methods.

AVAILABILITY: A computer software suite, multiTag, has been developed based on this novel algorithm. The program is freely available by written request to the author at ke_hao@merck.com.
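
The iterative selection described under RESULTS can be illustrated with a minimal Python sketch. This is not the multiTag implementation, which the abstract does not describe in detail. The sketch makes several assumptions: alleles are coded numerically in a samples-by-SNPs matrix, every SNP is polymorphic, "multiple-marker LD" is approximated by the multiple R^2 of a SNP regressed on two markers (a candidate plus one already chosen tag), and no neighborhood window is applied. The names `select_tags` and `two_marker_r2` and the 0.8 threshold are hypothetical choices made for illustration.

```python
import numpy as np

def two_marker_r2(C, i, t):
    """Multiple R^2 of every SNP regressed on SNPs i and t.

    Uses the closed form for two predictors, computed from the
    correlation matrix C. Illustrative stand-in for multi-marker LD.
    """
    r12 = C[i, t]
    denom = 1.0 - r12 ** 2
    if denom < 1e-12:          # i and t are collinear: t adds nothing
        return C[i] ** 2
    return (C[i] ** 2 + C[t] ** 2 - 2.0 * C[i] * C[t] * r12) / denom

def select_tags(G, threshold=0.8, multi_marker=True):
    """Greedy tag selection on G (samples x SNPs, numeric allele codes).

    Each iteration picks the non-tag SNP that captures the most
    still-uncaptured SNPs, pairwise or (optionally) jointly with an
    already selected tag. Ties go to the lowest SNP index.
    """
    C = np.corrcoef(G, rowvar=False)
    n = C.shape[0]
    uncovered = np.ones(n, dtype=bool)
    tags = []
    while uncovered.any():
        best, best_hit, best_gain = None, None, 0
        for i in range(n):
            if i in tags:
                continue
            hit = C[i] ** 2 >= threshold      # pairwise capture
            hit[i] = True                     # a tag captures itself
            if multi_marker:
                for t in tags:                # two-marker capture
                    hit |= two_marker_r2(C, i, t) >= threshold
            hit &= uncovered                  # count only new captures
            gain = int(hit.sum())
            if gain > best_gain:
                best, best_hit, best_gain = i, hit, gain
        tags.append(best)
        uncovered &= ~best_hit
    return tags

# Toy data: six haplotypes over three SNPs. SNP 2 carries the minor
# allele exactly when SNP 0 or SNP 1 does, so no single SNP predicts
# it well, but SNPs 0 and 1 together predict it perfectly.
G = np.array([[0, 0, 0], [0, 0, 0],
              [1, 0, 1], [1, 0, 1],
              [0, 1, 1], [0, 1, 1]])
print(select_tags(G, multi_marker=False))
print(select_tags(G, multi_marker=True))
```

On this toy matrix every pairwise r^2 is 0.25, so the pairwise-only run has to tag all three SNPs, while the two-marker run stops after SNPs 0 and 1 because together they capture SNP 2. This mirrors, at toy scale only, the abstract's point that multiple-marker LD can reduce the number of tags needed for a given coverage.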
