Journal: Frontiers in Genetics

Computationally efficient permutation-based confidence interval estimation for tail-area FDR

Abstract

The difficulty of satisfying parametric assumptions in genomic settings with thousands or millions of tests has led investigators to combine powerful False Discovery Rate (FDR) approaches with computationally expensive but exact permutation testing. We describe a computationally efficient permutation-based approach that includes a tractable estimator of the proportion of true null hypotheses, an estimator of the variance of the log of tail-area FDR, and a confidence interval (CI) estimator that accounts for the number of permutations conducted and for dependencies between tests. The CI estimator applies a binomial distribution and an overdispersion parameter to counts of positive tests. The approach is general with regard to the distribution of the test statistic, performs favorably in comparison with other approaches, and yields reliable FDR estimates with as few as 10 permutations. An application of this approach relating sleep patterns to gene expression patterns in mouse hypothalamus yielded a set of 11 transcripts associated with 24 h REM sleep [FDR = 0.15 (0.08, 0.26)]. Two of the corresponding genes, Sfrp1 and Sfrp4, are involved in Wnt signaling, and several others, Irf7, Ifit1, Iigp2, and Ifih1, have links to interferon signaling. These genes would have been overlooked had a typical a priori FDR threshold such as 0.05 or 0.1 been applied. The CI provides the flexibility to choose a significance threshold based on tolerance for false discoveries and on the precision of the FDR estimate. That is, it frees the investigator to define significance in a more data-driven way, for example by the minimum estimated FDR, an option that is especially useful for the weak effects often observed in studies of complex diseases.
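
The abstract does not give the estimator's formulas, so the following is only a rough sketch of the general idea, not the authors' method. It estimates tail-area FDR at a fixed threshold as the mean number of permuted statistics at or above the threshold divided by the number of observed statistics at or above it, and then forms a log-scale interval by a delta-method approximation that treats both counts as independent binomials. It assumes the proportion of true nulls is 1 and omits the overdispersion parameter described above, so the interval will be too narrow when tests are dependent. The function name `tail_area_fdr`, its parameters, and the simulated data are all hypothetical, and the "permuted" statistics are drawn directly from a null distribution as a stand-in for real label permutations.

```python
import numpy as np


def tail_area_fdr(observed, permuted, threshold, z=1.96):
    """Rough tail-area FDR estimate with a log-scale interval.

    observed : 1-D array of m observed test statistics
    permuted : 2-D array (n_perm, m) of statistics from permuted data
    Simplifying assumptions (not from the paper): proportion of true
    nulls = 1, independent binomial counts, no overdispersion.
    """
    observed = np.asarray(observed, dtype=float)
    permuted = np.asarray(permuted, dtype=float)
    m = observed.size
    n_perm = permuted.shape[0]

    s = int(np.sum(observed >= threshold))   # observed positive tests
    r = int(np.sum(permuted >= threshold))   # positives over all permutations
    if s == 0 or r == 0:
        # The log-scale interval is undefined when either count is zero.
        return float("nan"), (float("nan"), float("nan"))

    fdr = (r / n_perm) / s

    # Delta method: Var(log X) ~ (1 - p) / (n * p) for X ~ Binomial(n, p).
    var_log = (1 - s / m) / s + (1 - r / (m * n_perm)) / r
    half_width = z * np.sqrt(var_log)
    return fdr, (fdr * np.exp(-half_width), fdr * np.exp(half_width))


# Simulated example: 2000 tests, the first 100 carry a real effect.
rng = np.random.default_rng(0)
observed = np.abs(rng.normal(size=2000))
observed[:100] += 3.0
permuted = np.abs(rng.normal(size=(10, 2000)))  # stand-in for 10 permutations

fdr, (lower, upper) = tail_area_fdr(observed, permuted, threshold=3.0)
print(f"FDR = {fdr:.3f} ({lower:.3f}, {upper:.3f})")
```

Scanning a grid of thresholds with this function and inspecting both the point estimate and the interval width is one way to explore the data-driven choice of threshold that the abstract describes.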