
Reprioritizing Genetic Associations in Hit Regions Using LASSO-Based Resample Model Averaging



Abstract

Significance testing one SNP at a time has proven useful for identifying genomic regions that harbor variants affecting human disease. But after an initial genome scan has identified a “hit region” of association, single-locus approaches can falter. Local linkage disequilibrium (LD) can make both the number of underlying true signals and their identities ambiguous. Simultaneous modeling of multiple loci should help. However, it is typically applied ad hoc: conditioning on the top SNPs, with limited exploration of the model space and no assessment of how sensitive model choice was to sampling variability. Formal alternatives exist but are seldom used. Bayesian variable selection is coherent but requires specifying a full joint model, including priors on parameters and the model space. Penalized regression methods (e.g., LASSO) appear promising but require calibration, and, once calibrated, lead to a choice of SNPs that can be misleadingly decisive. We present a general method for characterizing uncertainty in model choice that is tailored to reprioritizing SNPs within a hit region under strong LD. Our method, LASSO local automatic regularization resample model averaging (LLARRMA), combines LASSO shrinkage with resample model averaging and multiple imputation, estimating for each SNP the probability that it would be included in a multi-SNP model in alternative realizations of the data. We apply LLARRMA to simulations based on case-control genome-wide association study data, and find that when there are several causal loci and strong LD, LLARRMA identifies a set of candidates that is enriched for true signals relative to single-locus analysis and to the recently proposed method of Stability Selection.

Genet. Epidemiol. 36:451–462, 2012. © 2012 Wiley Periodicals, Inc.
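
The resampling idea described in the abstract (refit a LASSO on many resamples and record how often each SNP enters the model) can be illustrated with a minimal sketch. This is not the LLARRMA implementation. It assumes a quantitative response fitted by a linear LASSO with a single fixed penalty `lam`, subsampling without replacement, and complete genotypes, so the per-resample automatic regularization, the case-control likelihood, and multiple imputation are all left out. The function names `soft_threshold`, `lasso_cd`, and `inclusion_frequencies` are hypothetical and chosen only for this illustration.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent LASSO for (1/2n)*||y - Xb||^2 + lam*||b||_1.

    X is a list of rows, y a list of responses. Columns and response are
    centred internally, so no intercept is returned. Assumption: linear
    (not logistic) loss, for brevity.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]          # residual for beta = 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:               # monomorphic column: skip
                continue
            # Correlation of column j with the partial residual.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    return beta


def inclusion_frequencies(X, y, lam, n_resamples=30, frac=0.5, seed=0):
    """Fraction of subsamples in which each column gets a nonzero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = max(2, int(frac * n))
    counts = [0] * p
    for _ in range(n_resamples):
        idx = rng.sample(range(n), m)          # subsample without replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_resamples for c in counts]
```

Calling `inclusion_frequencies(X, y, lam)` on a genotype matrix coded 0/1/2 returns one value in [0, 1] per SNP. Ranking SNPs by this frequency, rather than by a single fitted model, is what makes the prioritization sensitive to sampling variability. The fixed `lam` is the weakest part of the sketch, since the abstract indicates that the penalty is calibrated automatically.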
