Journal: BMC Medical Genomics

Calculating the statistical significance of rare variants causal for Mendelian and complex disorders

Abstract

With the expanding use of next-gen sequencing (NGS) to diagnose the thousands of rare Mendelian genetic diseases, it is critical to be able to interpret individual DNA variation. To calculate the significance of finding a rare protein-altering variant in a given gene, one must know the frequency of seeing a variant in the general population that is at least as damaging as the variant in question. We developed a general method to better interpret the likelihood that a rare variant is disease causing if observed in a given gene or genic region mapping to a described protein domain, using genome-wide information from a large control sample. Based on data from 2504 individuals in the 1000 Genomes Project dataset, we calculated the number of individuals who have a rare variant in a given gene for numerous filtering threshold scenarios, which may be used for calculating the significance of an observed rare variant being causal for disease. Additionally, we calculated mutational burden data on the number of individuals with rare variants in genic regions mapping to protein domains. We describe methods to use the mutational burden data for calculating the significance of observing rare variants in a given proportion of sequenced individuals. We present SORVA, an implementation of these methods as a web tool, and we demonstrate application to 20 relevant but diverse next-gen sequencing studies. Specifically, we calculate the statistical significance of findings involving multi-family studies with rare Mendelian disease and a large-scale study of a complex disorder, autism spectrum disorder. If we use the frequency counts to rank genes based on intolerance for variation, the ranking correlates well with pLI scores derived from the Exome Aggregation Consortium (ExAC) dataset (ρ = 0.515), with the benefit that the scores are directly interpretable.
We have presented a strategy that is useful for vetting candidate genes from NGS studies and allows researchers to calculate the significance of seeing a variant in a given gene or protein domain. This approach is an important step towards developing a quantitative, statistics-based approach for presenting clinical findings.
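
To make concrete the idea of calculating the significance of observing rare variants in a given proportion of sequenced individuals, the following is a minimal sketch that assumes a simple one-sided binomial model. It is not taken from SORVA and may differ from the calculation the tool actually performs. The function names and the example counts (5 control carriers, 3 of 4 affected individuals) are hypothetical; only the control sample size of 2504 comes from the abstract.

```python
from math import comb

# Number of individuals in the 1000 Genomes Project dataset (from the abstract).
CONTROL_SAMPLE_SIZE = 2504


def carrier_frequency(carriers_in_controls, control_size=CONTROL_SAMPLE_SIZE):
    """Fraction of control individuals carrying a qualifying rare variant in a gene."""
    if not 0 <= carriers_in_controls <= control_size:
        raise ValueError("carrier count must lie between 0 and the control size")
    return carriers_in_controls / control_size


def binomial_tail_pvalue(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    Assumption: each of the n sequenced, unrelated individuals carries a
    qualifying variant independently with probability p under the null.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 5 of 2504 controls carry a qualifying variant in the
# gene, and 3 of 4 unrelated affected individuals carry one.
p_null = carrier_frequency(5)
p_value = binomial_tail_pvalue(3, 4, p_null)
print(f"background carrier frequency = {p_null:.5f}, p-value = {p_value:.3e}")
```

A one-sided tail is used because the question is whether carriers are seen more often among affected individuals than the background frequency predicts. In a genome-wide study, the resulting p-value would still need correction for the number of genes tested.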
