Journal: Scandinavian Journal of Statistics

Implementing Monte Carlo tests with p-value buckets

Abstract

Software packages usually report the results of statistical tests using p-values. Users often interpret these values by comparing them with standard thresholds, for example, 0.1, 1, and 5%, which is sometimes reinforced by a star rating (***, **, and *, respectively). We consider an arbitrary statistical test whose p-value p is not available explicitly, but can be approximated by Monte Carlo samples, for example, by bootstrap or permutation tests. The standard implementation of such tests usually draws a fixed number of samples to approximate p. However, the probability that the exact and the approximated p-value lie on different sides of a threshold (the resampling risk) can be high, particularly for p-values close to a threshold. We present a method to overcome this. We consider a finite set of user-specified intervals that cover [0, 1] and that can be overlapping. We call these p-value buckets. We present algorithms that, with arbitrarily high probability, return a p-value bucket containing p. We prove that for both a bounded resampling risk and a finite runtime, overlapping buckets need to be employed, and that our methods both bound the resampling risk and guarantee a finite runtime for such overlapping buckets. To interpret decisions with overlapping buckets, we propose an extension of the star rating system. We demonstrate that our methods are suitable for use in standard software, including for low p-value thresholds occurring in multiple testing settings, and that they can be computationally more efficient than standard implementations.
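
The bucket idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a simple Hoeffding confidence interval with a union bound over batches, and stops as soon as the interval fits inside one bucket. The function name `find_bucket`, the bucket endpoints in `BUCKETS`, the batch size, and the error level `eps` are all hypothetical choices made for illustration. Buckets are treated as closed intervals for simplicity.

```python
import math
import random

# Hypothetical buckets: the classical ones split at 0.1%, 1%, and 5%,
# plus overlapping buckets straddling each threshold (illustrative values).
BUCKETS = [
    (0.0, 0.001), (0.001, 0.01), (0.01, 0.05), (0.05, 1.0),
    (0.0005, 0.002), (0.008, 0.012), (0.045, 0.055),
]

def find_bucket(draw_exceedance, buckets, eps=1e-3, batch=100,
                max_batches=100_000):
    """Sample until a confidence interval for p lies inside one bucket.

    draw_exceedance() must return True with (unknown) probability p,
    e.g. "the resampled statistic is at least the observed one".
    Returns (bucket, number_of_samples), or (None, n) if max_batches
    is exhausted first.
    """
    hits = 0
    n = 0
    for k in range(1, max_batches + 1):
        hits += sum(draw_exceedance() for _ in range(batch))
        n += batch
        # Error budget for batch k; eps / (k (k + 1)) sums to eps over all k,
        # so all intervals cover p simultaneously with probability >= 1 - eps.
        eps_k = eps / (k * (k + 1))
        # Two-sided Hoeffding half-width at level eps_k.
        half = math.sqrt(math.log(2.0 / eps_k) / (2.0 * n))
        lo = max(0.0, hits / n - half)
        hi = min(1.0, hits / n + half)
        for a, b in buckets:
            if a <= lo and hi <= b:
                return (a, b), n
    return None, n

# Usage: a simulated test whose true p-value is 0.3.
random.seed(0)
bucket, n_used = find_bucket(lambda: random.random() < 0.3, BUCKETS)
print(bucket, n_used)
```

In this sketch the interval width shrinks to zero as sampling continues, so it eventually fits inside some bucket whenever p lies in the interior of one. Without the overlapping buckets, a p-value sitting exactly on a threshold would never be resolved, which mirrors the abstract's point that overlapping buckets are needed for a finite runtime.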