Journal: PLoS Genetics

Dissection of a Complex Disease Susceptibility Region Using a Bayesian Stochastic Search Approach to Fine Mapping


Abstract

Identification of candidate causal variants in regions associated with risk of common diseases is complicated by linkage disequilibrium (LD) and multiple association signals. Nonetheless, accurate maps of these variants are needed, both to fully exploit detailed cell-specific chromatin annotation data to highlight disease causal mechanisms and cells, and for design of the functional studies that will ultimately be required to confirm causal mechanisms. We adapted a Bayesian evolutionary stochastic search algorithm to the fine mapping problem, and demonstrated its improved performance over conventional stepwise and regularised regression through simulation studies. We then applied it to fine map the established multiple sclerosis (MS) and type 1 diabetes (T1D) associations in the IL-2RA (CD25) gene region. For T1D, both stepwise and stochastic search approaches identified four T1D association signals, with the major effect tagged by the single nucleotide polymorphism rs12722496. In contrast, for MS, the stochastic search found two distinct competing models: a single candidate causal variant, tagged by rs2104286 and reported previously using stepwise analysis; and a more complex model with two association signals, one of which was tagged by the major T1D-associated rs12722496 and the other by rs56382813. There is low to moderate LD between rs2104286 and both rs12722496 and rs56382813 (r² ≈ 0.3), and our two-SNP model could not be recovered through a forward stepwise search after conditioning on rs2104286. Both signals in the two-variant model for MS affect CD25 expression on distinct subpopulations of CD4+ T cells, which are key cells in the autoimmune process. The results support a shared causal variant for T1D and MS. Our study illustrates the benefit of using a purposely designed model search strategy for fine mapping and the advantage of combining disease and protein expression data.

Author Summary

Genetic association studies have identified many DNA sequence variants that associate with disease risk. By exploiting the known correlation that exists between neighbouring variants in the genome, inference can be extended beyond those individual variants tested to identify sets within which a causal variant is likely to reside. However, this correlation, particularly in the presence of multiple disease-causing variants in relative proximity, makes disentangling the specific causal variants difficult. Statistical approaches to this fine mapping problem have traditionally taken a stepwise search approach, beginning with the most associated variant in a region, then iteratively attempting to find additional associated variants. We adapted to the fine mapping problem a stochastic search approach that avoids this stepwise process and is explicitly designed for dealing with highly correlated predictors. We showed in simulated data that it outperforms its stepwise counterpart and other variable selection strategies such as the lasso. We applied our approach to understand the association of two immune-mediated diseases to a region on chromosome 10p15. We identified a model for multiple sclerosis containing two variants, neither of which was found through a stepwise search, and functionally linked both of these to the neighbouring candidate gene, IL2RA, in independent data. Our approach can be used to aid fine mapping of other disease-associated regions, which is critical for design of functional follow-up studies required to understand the mechanisms through which genetic variants influence disease.
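
For readers unfamiliar with the conventional baseline described above, the following is a minimal sketch of a forward stepwise search and of a genotype-based LD estimate. It is not the authors' Bayesian stochastic search, and it makes several simplifying assumptions that do not come from the paper: a linear model with a quantitative outcome rather than case-control data, genotypes coded as 0/1/2 dosages, r² taken as the squared Pearson correlation of dosages, and a hypothetical stopping rule (`min_gain`) in place of a significance test. The function names `ld_r2` and `forward_stepwise` are illustrative only.

```python
import random


def center(v):
    """Subtract the mean so that an intercept is implicitly fitted."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors."""
    c1, c2 = center(g1), center(g2)
    return dot(c1, c2) ** 2 / (dot(c1, c1) * dot(c2, c2))


def forward_stepwise(columns, y, max_vars=3, min_gain=0.01):
    """Greedy forward selection for a linear model with intercept.

    Each step adds the variant that most reduces the residual sum of
    squares (RSS), conditional on the variants already selected. The search
    stops when the best relative reduction is below min_gain (an assumed,
    illustrative threshold) or max_vars variants have been chosen.
    """
    resid = center(y)
    cols = [center(c) for c in columns]  # predictors, residualised as we go
    selected = []
    while len(selected) < max_vars:
        rss = dot(resid, resid)
        if rss < 1e-12:
            break  # outcome already fully explained
        best, best_gain = None, 0.0
        for j, x in enumerate(cols):
            xx = dot(x, x)
            if j in selected or xx < 1e-12:
                continue  # skip chosen or fully collinear variants
            gain = dot(x, resid) ** 2 / xx  # RSS reduction from adding j
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None or best_gain < min_gain * rss:
            break
        xb = cols[best]
        xbxb = dot(xb, xb)
        # Condition on the chosen variant: remove its contribution from the
        # residual outcome and from every remaining predictor.
        b = dot(xb, resid) / xbxb
        resid = [r - b * a for r, a in zip(resid, xb)]
        for j, x in enumerate(cols):
            if j != best and j not in selected:
                c = dot(x, xb) / xbxb
                cols[j] = [v - c * a for v, a in zip(x, xb)]
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Simulated toy data (not from the study): variant 1 drives the outcome.
    random.seed(1)
    n = 300
    geno = [[random.choice([0, 1, 2]) for _ in range(n)] for _ in range(4)]
    trait = [2.0 * geno[1][i] + random.gauss(0, 1) for i in range(n)]
    print("selected variants:", forward_stepwise(geno, trait))
    print("r2 between variants 0 and 1:", ld_r2(geno[0], geno[1]))
```

Because every step conditions on the variants already chosen, a greedy search of this kind commits to its first pick. That is the limitation the summary points to: a model made up of variants in LD with the first pick may never be visited.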
