Asia-Pacific Bioinformatics Conference

The choice of null distributions for detecting gene-gene interactions in genome-wide association studies


Abstract

Background: In genome-wide association studies (GWAS), the number of single-nucleotide polymorphisms (SNPs) typically ranges between 500,000 and 1,000,000. Accordingly, detecting gene-gene interactions in GWAS is computationally challenging because it involves hundreds of billions of SNP pairs. Stage-wise strategies are often used to overcome the computational difficulty. In the first stage, fast screening methods (e.g., Tuning ReliefF) are applied to reduce the whole SNP set to a small subset. In the second stage, sophisticated modeling methods (e.g., multifactor-dimensionality reduction (MDR)) are applied to the subset of SNPs to identify interesting interaction models and the corresponding interaction patterns. In the third stage, the significance of the identified interaction patterns is evaluated by hypothesis testing.

Results: In this paper, we show that this stage-wise strategy can fail to control the false positive rate if the null distribution is not chosen appropriately. This is because screening and modeling select the strongest-looking candidates from the data, so the test statistic no longer follows the null distribution that would apply to a pre-specified SNP pair. In our simulation study, we use some popular screening methods and the popular modeling method MDR as examples to show the effect of an inappropriate choice of null distribution. To choose appropriate null distributions, we suggest using the permutation test or testing on an independent data set. We demonstrate their performance using synthetic data and a real genome-wide data set from an Age-related Macular Degeneration (AMD) study.

Conclusions: The permutation test or testing on an independent data set can help choose appropriate null distributions in hypothesis testing, which provides more reliable results in practice.
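The permutation idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a single-SNP chi-square screen stands in for the screening stage, a maximum pairwise chi-square score stands in for MDR, and all function names (`chi_square`, `pipeline_statistic`, `permutation_p_value`) and parameters are hypothetical. The key point is that screening and modeling are repeated inside every permutation, so the null distribution reflects the selection performed by the earlier stages.

```python
from itertools import combinations

import numpy as np


def chi_square(cells, labels, n_cells):
    """Pearson chi-square statistic for an (n_cells x 2) table of cell vs. case/control."""
    table = np.zeros((n_cells, 2))
    np.add.at(table, (cells, labels), 1)           # count samples per (cell, label)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    mask = expected > 0                             # skip empty genotype cells
    return float((((table - expected) ** 2)[mask] / expected[mask]).sum())


def pipeline_statistic(genotypes, labels, n_keep=5):
    """Stage 1: keep the n_keep SNPs with the largest single-SNP statistic.
    Stage 2: return the largest pairwise statistic among the kept SNPs.
    (Toy stand-ins for a screening method and for MDR; an assumption.)"""
    n_snps = genotypes.shape[1]
    single = np.array([chi_square(genotypes[:, j], labels, 3) for j in range(n_snps)])
    kept = np.argsort(single)[-n_keep:]
    best = 0.0
    for a, b in combinations(kept, 2):
        cells = 3 * genotypes[:, a] + genotypes[:, b]   # 9 joint genotype cells
        best = max(best, chi_square(cells, labels, 9))
    return best


def permutation_p_value(genotypes, labels, n_perm=200, n_keep=5, seed=0):
    """Stage 3: permute the labels and rerun the WHOLE pipeline each time."""
    rng = np.random.default_rng(seed)
    observed = pipeline_statistic(genotypes, labels, n_keep)
    null = np.array([
        pipeline_statistic(genotypes, rng.permutation(labels), n_keep)
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimated p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    genotypes = rng.integers(0, 3, size=(300, 20))   # genotypes coded 0/1/2
    labels = rng.integers(0, 2, size=300)            # pure-null case/control labels
    print(permutation_p_value(genotypes, labels))
```

By contrast, comparing the selected pair's statistic with a chi-square reference distribution with 8 degrees of freedom would ignore that the pair was chosen as the maximum over many candidates, which is the kind of mismatch the abstract warns about.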
