Human Molecular Genetics

Evaluation of genome-wide power of genetic association studies based on empirical data from the HapMap project.


Abstract

With recent advances in high-throughput single nucleotide polymorphism (SNP) typing technologies, genome-wide association studies have become a realistic approach to identifying the causative genes responsible for common diseases of complex genetic traits. In this strategy, the trade-off between increased genome coverage and the chance of finding SNPs that incidentally show large test statistics becomes serious because of extreme multiple-hypothesis testing. We investigated the extent to which this trade-off limits the genome-wide power of this approach by simulating a large number of case-control panels based on empirical data from the HapMap Project. In our simulations, the statistical cost of multiple-hypothesis testing was evaluated by empirically calculating distributions of the maximum value of the χ² statistic for a series of marker sets with increasing numbers of SNPs; these distributions were then used to determine a genome-wide threshold in the subsequent power simulations. With a practical study size, the cost of multiple testing largely offsets the potential benefits of increased genome coverage when genetic effects are modest and/or causal alleles have low frequencies. In most realistic scenarios, increasing genome coverage becomes less influential on power, while sample size is the predominant determinant of the feasibility of genome-wide association tests. Increasing genome coverage without a corresponding increase in sample size will only consume resources with little gain in power. For common causal alleles with relatively large effect sizes (genotype relative risk ≥ 1.7), we can expect satisfactory power with currently available large-scale genotyping platforms using a realistic sample size (approximately 1000 per arm).
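The idea of deriving a genome-wide threshold from the empirical distribution of the maximum χ² statistic can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes independent SNPs with a single fixed allele frequency and a simple allelic 2x2 test, whereas the study resampled HapMap data, in which markers are correlated. All function names, the panel size, the allele frequency, and the number of replicates are hypothetical choices made for illustration only.

```python
import random


def allelic_chi2(case_alt, control_alt, n_alleles):
    """Pearson chi-square (1 df) for a 2x2 allele-count table.

    n_alleles is the number of alleles sampled in EACH arm (2 x individuals).
    """
    a, b = case_alt, n_alleles - case_alt
    c, d = control_alt, n_alleles - control_alt
    col_alt, col_ref = a + c, b + d
    if col_alt == 0 or col_ref == 0:
        return 0.0  # monomorphic SNP: no association signal
    total = 2 * n_alleles
    return total * (a * d - b * c) ** 2 / (n_alleles * n_alleles * col_alt * col_ref)


def draw_alt_count(n_alleles, freq, rng):
    """Number of alternative alleles among n_alleles independent draws."""
    return sum(rng.random() < freq for _ in range(n_alleles))


def max_null_chi2(n_snps, n_alleles, freq, rng):
    """Largest chi-square over n_snps SNPs in one null case-control panel."""
    best = 0.0
    for _ in range(n_snps):
        # Under the null, cases and controls share the same allele frequency.
        case_alt = draw_alt_count(n_alleles, freq, rng)
        control_alt = draw_alt_count(n_alleles, freq, rng)
        best = max(best, allelic_chi2(case_alt, control_alt, n_alleles))
    return best


def genome_wide_threshold(n_snps, n_per_arm=50, freq=0.3, reps=300,
                          alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of the maximum null chi-square."""
    rng = random.Random(seed)
    n_alleles = 2 * n_per_arm  # assumption: diploid individuals
    maxima = sorted(max_null_chi2(n_snps, n_alleles, freq, rng)
                    for _ in range(reps))
    index = min(reps - 1, int((1 - alpha) * reps))
    return maxima[index]


# Thresholds for marker sets of increasing size (toy sizes, not genome scale).
thresholds = {m: genome_wide_threshold(m) for m in (1, 10, 100)}
for m, t in thresholds.items():
    print(f"{m:>4} SNPs: empirical 5% threshold on max chi-square = {t:.2f}")
```

The threshold rises as the marker set grows, which is the multiple-testing cost the abstract weighs against increased coverage. Because the SNPs here are independent, the penalty is likely larger than it would be for the same number of correlated markers.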
