False discovery rates: Theory and applications to DNA microarrays.

Abstract

Multiple hypothesis testing is concerned with appropriately controlling the rate of false positives when testing several hypotheses simultaneously, while maintaining the power of each test as much as possible. One multiple hypothesis testing error measure is the False Discovery Rate (FDR), which is loosely defined to be the expected proportion of false positives among all significant hypotheses. The FDR is especially appropriate for exploratory analyses in which one is interested in finding many significant results among many tests. In this work, we introduce a modified version of the FDR called the “positive False Discovery Rate” (pFDR). We argue the pFDR is a more appropriate and useful error measure, and we investigate its statistical properties. When assuming the test statistics come from a mixture distribution, we show the pFDR can be written as a posterior probability and can be connected to classification theory. These properties remain asymptotically true under fairly general conditions, even under certain forms of dependence. Also, a new quantity called the “q-value” is introduced and investigated, which is a natural “Bayesian p-value,” or rather the pFDR analogue of the p-value. This idea is also generalized to any multiple hypothesis testing error measure. Using these results, we introduce point estimates of the FDR and pFDR for fixed rejection regions. The point estimates provide proper conservative behavior in the three scenarios of (1) estimating false discovery rates for fixed rejection regions, (2) estimating rejection regions for fixed false discovery rates, and (3) simultaneously estimating false discovery rates over all possible rejection regions—even under certain forms of dependence. It is shown that this new set of methodology extends the current methodology and also provides increases in power. We apply the methodology to the problem of detecting differential gene expression between two or more biological samples based on DNA microarray data. 
This application is well suited because the dependence between the tests (genes) is weak and the number of tests is quite large.
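The abstract names point estimates of the FDR for fixed rejection regions and the q-value, but gives no formulas. As a rough illustration only, the sketch below assumes p-value rejection regions of the form [0, t] and the commonly published form of the estimator, in which the proportion of true nulls is estimated from the p-values above a tuning value `lam`. The function names, the default `lam=0.5`, and the toy p-values are assumptions made for this example and are not taken from this record; the pFDR variant is not shown.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true null hypotheses from p-values above `lam`."""
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def estimate_fdr(pvalues, threshold, lam=0.5):
    """Point estimate of the FDR for the fixed rejection region [0, threshold]."""
    m = len(pvalues)
    rejected = sum(1 for p in pvalues if p <= threshold)
    # max(rejected, 1) avoids division by zero when nothing is rejected.
    estimate = estimate_pi0(pvalues, lam) * m * threshold / max(rejected, 1)
    return min(1.0, estimate)


def estimate_qvalues(pvalues, lam=0.5):
    """q-value of each test: the smallest estimated FDR over all
    rejection regions [0, t] that still contain that test's p-value."""
    m = len(pvalues)
    pi0 = estimate_pi0(pvalues, lam)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so running_min covers every t >= p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        qvalues[i] = running_min
    return qvalues


if __name__ == "__main__":
    toy_pvalues = [0.01, 0.02, 0.03, 0.6, 0.8]  # hypothetical data
    print(estimate_pi0(toy_pvalues))
    print(estimate_fdr(toy_pvalues, threshold=0.05))
    print(estimate_qvalues(toy_pvalues))
```

The three functions loosely mirror the three scenarios listed in the abstract: `estimate_fdr` handles a fixed rejection region, thresholding the output of `estimate_qvalues` at a target level picks a rejection region for a fixed false discovery rate, and the q-values themselves summarize the estimates over all possible regions at once.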

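Bibliographic record

  • Author

    Storey, John David.

  • Author affiliation

    Stanford University.

  • Degree-granting institution: Stanford University.
  • Subjects: Statistics; Biology, Genetics; Biology, Biostatistics.
  • Degree: Ph.D.
  • Year: 2002
  • Page: p.1915
  • Total pages: 138
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Statistics
  • Keywords

  • Date added to database: 2022-08-17 11:46:20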
