Journal: Bioinformatics

Assessing statistical significance in multivariable genome wide association analysis


Abstract

Motivation: Although genome-wide association studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS.

Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing the significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family-wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available from all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the 'spuriously correlated' SNP merely happens to be correlated with the 'truly causal' SNP. In addition, the method offers a data-driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies.

Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS.

Supplementary information: Supplementary data are available at Bioinformatics online.
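The contrast the abstract draws between marginal (one SNP at a time) and multivariable testing can be illustrated with a small simulation. The sketch below is not the hierGWAS procedure and does not control the FWER. It assumes a quantitative phenotype and ordinary least squares in place of the generalized linear model, and every name and parameter value in it (`simulate`, `ols_t_statistics`, the allele frequency, the effect size) is a hypothetical choice made only for illustration. One SNP is causal; a second "tagging" SNP is merely correlated with it.

```python
import numpy as np


def ols_t_statistics(X, y):
    """Return the OLS t-statistic of each column of X (an intercept is fitted, then dropped)."""
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    dof = n - design.shape[1]
    sigma2 = residuals @ residuals / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return (beta / np.sqrt(np.diag(cov)))[1:]


def simulate(n=2000, maf=0.3, copy_prob=0.8, effect=0.5, seed=0):
    """Simulate a causal SNP, a correlated non-causal SNP, and a phenotype (all assumed values)."""
    rng = np.random.default_rng(seed)
    causal = rng.binomial(2, maf, size=n)          # genotypes coded 0/1/2
    independent = rng.binomial(2, maf, size=n)
    copy = rng.random(n) < copy_prob               # tagging SNP copies the causal one most of the time
    tagging = np.where(copy, causal, independent)
    y = effect * causal + rng.normal(size=n)       # only the causal SNP affects the phenotype
    return causal.astype(float), tagging.astype(float), y


causal, tagging, y = simulate()

# Marginal analysis: the tagging SNP is tested on its own.
marginal_t_tagging = ols_t_statistics(tagging[:, None], y)[0]

# Multivariable analysis: both SNPs enter one model, so each is tested given the other.
joint_t_causal, joint_t_tagging = ols_t_statistics(np.column_stack([causal, tagging]), y)

print(f"marginal t (tagging SNP): {marginal_t_tagging:.2f}")
print(f"joint t (causal SNP):     {joint_t_causal:.2f}")
print(f"joint t (tagging SNP):    {joint_t_tagging:.2f}")
```

Under these assumptions the tagging SNP shows a large marginal t-statistic, yet its statistic in the joint model is small, because it carries no information about the phenotype beyond what the causal SNP already provides. Real GWAS data have far more SNPs than samples, which is why the paper needs a high-dimensional procedure rather than a plain regression like this one.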
