
The Weighting Is The Hardest Part: On The Behavior of the Likelihood Ratio Test and the Score Test Under a Data-Driven Weighting Scheme in Sequenced Samples



Abstract

Sequence-based association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Because the true weights are generally unknown, and so are subject to misspecification, we examined the efficiency of a data-driven weighting scheme.

We propose the use of a set of theoretically defensible weighting schemes, of which, we assume, the one that gives the largest test statistic is likely to capture best the allele frequency-functional effect relationship. We show that the use of alternative weights obviates the need to impose arbitrary frequency thresholds in sequence data association analyses. As both the score test and the likelihood ratio test (LRT) may be used in this context, and may differ in power, we characterize the behavior of both tests.

We found that the two tests have equal power if the set of weights resembled the correct ones. However, if the weights are badly specified, the LRT shows superior power (due to its robustness to misspecification). With this data-driven weighting procedure, the LRT detected significant signal in genes located in regions already confirmed as associated with schizophrenia, PRRC2A (P = 1.020E-06) and VARS2 (P = 2.383E-06), in the Swedish schizophrenia case-control cohort of 11,040 individuals with exome-sequencing data.

The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful. However, the LRT has the advantageous properties of being generally more robust and more powerful under weight misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
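The data-driven weighting idea described above (evaluate several candidate weighting schemes and keep the one with the largest test statistic) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the three Beta-density schemes, the squaring and normalisation of the weights to put the statistics on a comparable scale, the intercept-only null model, the permutation calibration of the maximum, and the function names `beta_density`, `weighted_score_stat` and `max_weight_test`. Only a SKAT-style score statistic is shown; the likelihood ratio test is not implemented.

```python
import numpy as np
from math import gamma

# Hypothetical candidate schemes: Beta(a, b) densities evaluated at the MAF.
CANDIDATE_SCHEMES = {
    "beta(1,25)": (1.0, 25.0),      # strongly up-weights rare variants
    "beta(0.5,0.5)": (0.5, 0.5),    # milder up-weighting of rare variants
    "beta(1,1)": (1.0, 1.0),        # flat: every variant weighted equally
}

def beta_density(x, a, b):
    """Beta(a, b) probability density at x, for 0 < x < 1."""
    const = gamma(a + b) / (gamma(a) * gamma(b))
    return const * x ** (a - 1) * (1 - x) ** (b - 1)

def weighted_score_stat(G, resid, w):
    """SKAT-style statistic Q = sum_j w_j * (sum_i resid_i * G_ij)^2."""
    s = G.T @ resid                 # per-variant score contributions
    return float(np.sum(w * s ** 2))

def max_weight_test(G, y, schemes=CANDIDATE_SCHEMES, n_perm=200, seed=0):
    """Return (best scheme name, its statistic, permutation p-value of the max)."""
    rng = np.random.default_rng(seed)
    p = G.mean(axis=0) / 2.0        # frequency of the counted allele
    maf = np.clip(np.minimum(p, 1.0 - p), 1e-6, 0.5)

    # Assumption: square the density and normalise to sum to one so that
    # statistics from different schemes are on a comparable scale.
    weights = {}
    for name, (a, b) in schemes.items():
        w = beta_density(maf, a, b) ** 2
        weights[name] = w / w.sum()

    resid = y - y.mean()            # intercept-only null model (no covariates)
    observed = {n: weighted_score_stat(G, resid, w) for n, w in weights.items()}
    best = max(observed, key=observed.get)

    # Calibrate the maximum over schemes by permuting the residuals, which
    # accounts for having selected the best scheme from the data.
    exceed = 0
    for _ in range(n_perm):
        r = rng.permutation(resid)
        perm_max = max(weighted_score_stat(G, r, w) for w in weights.values())
        if perm_max >= observed[best]:
            exceed += 1
    p_value = (1 + exceed) / (n_perm + 1)
    return best, observed[best], p_value
```

Here `G` is an individuals-by-variants matrix of minor-allele counts (0, 1 or 2) and `y` is the phenotype vector. Because the scheme is chosen after looking at the data, the p-value is computed for the maximum statistic and not for any single scheme; an analysis at exome scale would need an analytic or far larger resampling approximation than the small permutation count used in this sketch.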
