Journal of computational biology: A journal of computational molecular cell biology

Efficiently Identifying Significant Associations in Genome-wide Association Studies



Abstract

Over the past several years, genome-wide association studies (GWAS) have implicated hundreds of genes in common disease. More recently, the GWAS approach has been used to identify regions of the genome that harbor variation affecting gene expression, known as expression quantitative trait loci (eQTLs). In GWAS applied to clinical traits, only a handful of phenotypes are analyzed per study. In eQTL studies, by contrast, tens of thousands of gene expression levels are measured, and the GWAS approach is applied to each of them. This leads to billions of statistical tests and requires substantial computational resources, particularly when applying novel statistical methods such as mixed models. We introduce a novel two-stage testing procedure that identifies all of the significant associations more efficiently than testing every single nucleotide polymorphism (SNP). In the first stage, a small number of informative SNPs, or proxies, across the genome are tested. Based on their observed associations, our approach locates the regions that may contain significant SNPs and tests additional SNPs only from those regions. We show through simulations and analysis of real GWAS datasets that the proposed two-stage procedure increases the computational speed by a factor of 10. Additionally, an efficient implementation of our software increases the computational speed relative to state-of-the-art testing approaches by a factor of 75.
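The two-stage idea can be illustrated with a minimal Python sketch. This is a simplified illustration, not the authors' procedure: the fixed-size blocks, the choice of the middle SNP as proxy, the correlation-based z-score, the `lenient` and `strict` thresholds, and the function names `assoc_z` and `two_stage_scan` are all assumptions made for this example. In particular, this naive version does not guarantee that every significant SNP is found; that would require choosing the stage-one threshold from the correlation between each proxy and its neighbors.

```python
import math


def assoc_z(snp, phenotype):
    """Association z-score for one SNP: Pearson r times sqrt(n)."""
    n = len(phenotype)
    mean_g = sum(snp) / n
    mean_y = sum(phenotype) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(snp, phenotype))
    var_g = sum((g - mean_g) ** 2 for g in snp)
    var_y = sum((y - mean_y) ** 2 for y in phenotype)
    if var_g == 0 or var_y == 0:
        return 0.0  # monomorphic SNP or constant phenotype: no signal
    return cov / math.sqrt(var_g * var_y) * math.sqrt(n)


def two_stage_scan(snps, phenotype, block=20, lenient=2.0, strict=5.0):
    """Scan SNPs in two stages and count how many tests were computed.

    snps: list of per-SNP genotype lists, ordered by genomic position.
    Returns (hits, n_tests), where hits maps SNP index to z-score.
    """
    hits = {}
    n_tests = 0
    for start in range(0, len(snps), block):
        stop = min(start + block, len(snps))
        proxy = (start + stop) // 2  # assumed proxy: middle SNP of the block
        z_proxy = assoc_z(snps[proxy], phenotype)
        n_tests += 1
        if abs(z_proxy) < lenient:
            continue  # stage 1: proxy shows no signal, skip the region
        # Stage 2: test the remaining SNPs in the flagged region.
        for i in range(start, stop):
            if i == proxy:
                z = z_proxy
            else:
                z = assoc_z(snps[i], phenotype)
                n_tests += 1
            if abs(z) >= strict:
                hits[i] = z
    return hits, n_tests


if __name__ == "__main__":
    # Toy data: one block of SNPs tracks the phenotype, the other does not.
    linked = [0.0, 1.0] * 32
    unlinked = [0.0, 0.0, 1.0, 1.0] * 16
    snps = [linked] * 20 + [unlinked] * 20
    hits, n_tests = two_stage_scan(snps, linked)
    print(f"{len(hits)} significant SNPs found using {n_tests} of {len(snps)} tests")
```

The saving comes from the skipped regions: a block whose proxy shows no signal costs one test instead of `block` tests, so the fraction of the genome with real signal largely determines the speedup.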
