Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets

Abstract

Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Me) for the adjustment of multiple testing, but current methods of calculation for Me are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Me. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Me, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ~10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds of ~5 × 10⁻⁸ for current or merged commercial genotyping arrays, ~10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset and ~5 × 10⁻⁸ for the common SNPs only within genes.

Electronic supplementary material: The online version of this article (doi:10.1007/s00439-011-1118-2) contains supplementary material, which is available to authorized users.
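
The abstract does not spell out how an effective number of independent tests is converted into a significance threshold. As a rough illustration only, the sketch below assumes the common Bonferroni-style rule (threshold = alpha / Me) and, for comparison, the Šidák form; the function names are hypothetical and are not part of GEC. Under that assumption, a threshold of ~5 × 10⁻⁸ at a genome-wide type 1 error rate of 0.05 corresponds to roughly one million effective independent tests.

```python
import math


def bonferroni_threshold(alpha: float, m_e: float) -> float:
    """Per-test p-value threshold under a Bonferroni-style correction.

    Assumption: the genome-wide type 1 error rate `alpha` is divided
    evenly across `m_e` effective independent tests.
    """
    return alpha / m_e


def sidak_threshold(alpha: float, m_e: float) -> float:
    """Per-test threshold under the Sidak correction, 1 - (1 - alpha)^(1/m_e).

    Uses log1p/expm1 so the result stays accurate when m_e is very large.
    """
    return -math.expm1(math.log1p(-alpha) / m_e)


def implied_effective_tests(alpha: float, threshold: float) -> float:
    """Invert the Bonferroni-style rule: the Me implied by a given threshold."""
    return alpha / threshold


if __name__ == "__main__":
    alpha = 0.05
    # Thresholds quoted in the abstract, mapped back to an implied Me.
    for threshold in (1e-7, 5e-8, 1e-8):
        m_e = implied_effective_tests(alpha, threshold)
        print(f"threshold {threshold:.0e} -> implied Me of about {m_e:,.0f}")
```

The Šidák threshold is always slightly larger than the Bonferroni one for the same Me, but at genome-wide scale the two differ by only a few percent, so the choice between them rarely changes which SNPs pass.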
