Journal of Computational Biology

An Efficient Nonlinear Regression Approach for Genome-wide Detection of Marginal and Interacting Genetic Variations


Abstract

Genome-wide association studies have revealed individual genetic variants associated with phenotypic traits such as disease risk and gene expression. However, detecting pairwise interaction effects of genetic variants on traits remains a challenge because of the large number of variant combinations (∼10^11 SNP pairs in the human genome) and relatively small sample sizes (typically <10^4). Despite recent breakthroughs in detecting interaction effects, several problems remain open, including: (1) how to quickly process a large number of SNP pairs, (2) how to distinguish true signals from SNPs/SNP pairs that are merely correlated with true signals, (3) how to detect nonlinear associations between SNP pairs and traits given small sample sizes, and (4) how to control false positives. In this article, we present a unified framework, called SPHINX, which addresses these challenges. We first propose a piecewise linear model for interaction detection, because it is simple enough for its parameters to be estimated from small samples but complex enough to capture nonlinear interaction effects. Then, based on the piecewise linear model, we introduce randomized group lasso under stability selection, together with a screening algorithm, to address the statistical and computational challenges mentioned above. In our experiments, we first demonstrate that SPHINX achieves better power than existing methods for interaction detection under false positive control. We further applied SPHINX to a late-onset Alzheimer's disease dataset and report 16 SNPs and 17 SNP pairs associated with gene traits. We also present a highly scalable implementation of our screening algorithm, which can screen ∼118 billion candidate associations on a 60-node cluster in <5.5 hours.
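The idea of stability selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the SPHINX implementation: the base selector below is a plain top-k marginal-correlation filter, swapped in for the randomized group lasso, and the piecewise linear model is replaced by simple genotype products. All names (`pearson`, `stability_selection`), the simulated genotypes, the effect size, and the threshold are assumptions made for illustration only.

```python
import itertools
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def stability_selection(features, y, top_k, n_rounds, threshold, rng):
    """Return {feature name: selection frequency} for features chosen in at
    least `threshold` of the random half-sample rounds.

    The per-round selector is a top-k correlation filter (an assumption of
    this sketch, standing in for a penalized regression)."""
    names = list(features)
    n = len(y)
    counts = dict.fromkeys(names, 0)
    for _ in range(n_rounds):
        idx = rng.sample(range(n), n // 2)  # subsample half of the individuals
        y_sub = [y[i] for i in idx]
        ranked = sorted(
            names,
            key=lambda name: abs(pearson([features[name][i] for i in idx], y_sub)),
            reverse=True,
        )
        for name in ranked[:top_k]:
            counts[name] += 1
    freqs = {name: counts[name] / n_rounds for name in names}
    return {name: f for name, f in freqs.items() if f >= threshold}


# Hypothetical toy data: 6 SNPs coded 0/1/2, trait driven by the pair (0, 1).
rng = random.Random(0)
n_samples, n_snps = 200, 6
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_samples)] for _ in range(n_snps)]
y = [2.0 * genotypes[0][i] * genotypes[1][i] + rng.gauss(0.0, 1.0) for i in range(n_samples)]

# Candidate features: every single SNP (j,) and every SNP pair (a, b).
features = {(j,): genotypes[j] for j in range(n_snps)}
for a, b in itertools.combinations(range(n_snps), 2):
    features[(a, b)] = [ga * gb for ga, gb in zip(genotypes[a], genotypes[b])]

stable = stability_selection(features, y, top_k=1, n_rounds=50, threshold=0.8, rng=rng)
print(stable)
```

In this toy setting the single SNPs 0 and 1 are also correlated with the trait, which mirrors open problem (2) in the abstract: a feature is kept only if it wins consistently across subsamples, not merely because it is correlated with the true signal.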
