Source: BMC Proceedings

Evaluating methods for the analysis of rare variants in sequence data

Abstract

A number of rare variant statistical methods have been proposed for analysis of the impending wave of next-generation sequencing data. To date, there are few direct comparisons of these methods on real sequence data. Furthermore, there is a strong need for practical advice on the proper analytic strategies for rare variant analysis. We compare four recently proposed rare variant methods (combined multivariate and collapsing, weighted sum, proportion regression, and cumulative minor allele test) on simulated phenotype and next-generation sequencing data as part of Genetic Analysis Workshop 17. Overall, we find that all analyzed methods have serious practical limitations in identifying causal genes. Specifically, no method has more than a 5% true discovery rate (percentage of truly causal genes among all those identified as significantly associated with the phenotype). Further exploration shows that all methods suffer from inflated false-positive error rates (chance that a noncausal gene will be identified as associated with the phenotype) because of population stratification and gametic phase disequilibrium between noncausal SNPs and causal SNPs. Furthermore, the observed true-positive rates (chance that a truly causal gene will be identified as significantly associated with the phenotype) for each of the four methods were very low (<19%). The combination of larger than anticipated false-positive rates, low true-positive rates, and only about 1% of all genes being causal yields poor discriminatory ability for all four methods. Gametic phase disequilibrium and population stratification are important areas for further research in the analysis of rare variant data.
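
The way these three quantities combine can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes the true discovery rate can be approximated by the ratio of expected true hits to expected total hits, and the function name `true_discovery_rate` and the false-positive rates tried are hypothetical values chosen for illustration. Only the 1% causal fraction and the 19% upper bound on the true-positive rate come from the abstract.

```python
def true_discovery_rate(causal_fraction, tpr, fpr):
    """Approximate share of significant genes that are truly causal.

    Uses expected counts per gene tested:
      true hits  = causal_fraction * tpr
      false hits = (1 - causal_fraction) * fpr
    """
    true_hits = causal_fraction * tpr
    false_hits = (1.0 - causal_fraction) * fpr
    return true_hits / (true_hits + false_hits)


# From the abstract: about 1% of genes are causal, true-positive rate below 19%.
# Assumed for illustration only: the false-positive rates in the loop.
for fpr in (0.01, 0.05, 0.10):
    tdr = true_discovery_rate(0.01, 0.19, fpr)
    print(f"FPR={fpr:.2f} -> TDR={tdr:.3f}")
```

Under these assumptions, a false-positive rate at the nominal 5% level already pushes the true discovery rate below 5%, because noncausal genes outnumber causal ones by roughly 99 to 1.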
