FastANOVA: an Efficient Algorithm for Genome-Wide Association Study


Abstract

Studying the association between quantitative phenotypes (such as height or weight) and single nucleotide polymorphisms (SNPs) is an important problem in biology. To understand the underlying mechanisms of complex phenotypes, it is often necessary to consider joint genetic effects across multiple SNPs. The ANOVA (analysis of variance) test is routinely used in association studies, and important findings from studying gene-gene (SNP-pair) interactions are appearing in the literature. However, the number of SNPs can reach the millions, so evaluating joint effects of SNPs is a challenging task even for SNP-pairs. Moreover, because a large number of SNPs are correlated, a permutation procedure is preferred over simple Bonferroni correction for properly controlling the family-wise error rate and retaining mapping power, which dramatically increases the computational cost of an association study.

In this paper, we study the problem of finding SNP-pairs that have significant associations with a given quantitative phenotype. We propose an efficient algorithm, FastANOVA, for performing ANOVA tests on SNP-pairs in batch mode, which also supports large permutation tests. We derive an upper bound on the SNP-pair ANOVA test, which can be expressed as the sum of two terms. The first term is based on the single-SNP ANOVA test. The second term is based on the SNPs alone and is independent of any phenotype permutation. Furthermore, SNP-pairs can be organized into groups, each of which shares a common upper bound. This allows for maximum reuse of intermediate computation, efficient upper bound estimation, and effective SNP-pair pruning. Consequently, FastANOVA only needs to perform the ANOVA test on a small number of candidate SNP-pairs, without the risk of missing any significant ones. Extensive experiments demonstrate that FastANOVA is orders of magnitude faster than the brute-force implementation of ANOVA tests on all SNP-pairs.
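To make the computational cost concrete, here is a minimal Python sketch of the brute-force baseline the abstract compares against: an ANOVA F-statistic for every SNP-pair, with a permutation procedure for family-wise error control. It is not FastANOVA itself and does not implement the paper's upper bound or pruning. The function names, the integer genotype coding, and the choice of a one-way ANOVA over joint-genotype cells are all assumptions made for illustration.

```python
import itertools
import numpy as np


def f_statistic(groups, y):
    """One-way ANOVA F-statistic of phenotype y across integer group labels."""
    y = np.asarray(y, dtype=float)
    labels, inverse = np.unique(groups, return_inverse=True)
    g, n = len(labels), len(y)
    if g < 2 or n <= g:
        return 0.0  # test undefined; treat as no evidence
    counts = np.bincount(inverse, minlength=g)
    sums = np.bincount(inverse, weights=y, minlength=g)
    ss_between = np.sum(sums ** 2 / counts) - y.sum() ** 2 / n
    ss_within = np.sum((y - y.mean()) ** 2) - ss_between
    if ss_within <= 0:
        return float("inf")
    return float((ss_between / (g - 1)) / (ss_within / (n - g)))


def joint_genotype(a, b):
    """Encode the joint genotype of two SNPs (non-negative ints) as one label."""
    return a * (b.max() + 1) + b


def pair_f_statistics(X, y):
    """Yield ((i, j), F) for every pair of SNP columns in X (brute force)."""
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        yield (i, j), f_statistic(joint_genotype(X[:, i], X[:, j]), y)


def permutation_threshold(X, y, n_perm=100, alpha=0.05, seed=0):
    """Critical F value from the max statistic over phenotype permutations."""
    rng = np.random.default_rng(seed)
    maxima = []
    for _ in range(n_perm):
        y_perm = rng.permutation(y)  # break any genotype-phenotype link
        maxima.append(max(f for _pair, f in pair_f_statistics(X, y_perm)))
    return float(np.quantile(maxima, 1.0 - alpha))


def significant_pairs(X, y, threshold):
    """SNP-pairs whose F exceeds the threshold, strongest first."""
    hits = [(pair, f) for pair, f in pair_f_statistics(X, y) if f > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

With m SNPs and K permutations, this sketch evaluates on the order of K·m² tests, which is the cost that the upper-bound pruning described in the abstract is meant to avoid. A typical call would compute `permutation_threshold(X, y)` once on an individuals-by-SNPs matrix `X` and then pass the result to `significant_pairs(X, y, threshold)`.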


Bibliographic record

Source: ACM KDD International Conference on Knowledge Discovery and Data Mining; KDD 2008 | 2008 | pp. 803-811 | 9 pages
Authors: Xiang Zhang; Fei Zou; Wei Wang
Original format: PDF
Chinese Library Classification: Information and knowledge dissemination
Keywords: association study; ANOVA test

