Journal: Computational Statistics & Data Analysis

Investigations into refinements of Storey's method of multiple hypothesis testing minimising the FDR, and its application to test binomial data



Abstract

Storey's method for multiple hypothesis testing, the "Optimal Discovery Procedure" (ODP), minimises the false discovery rate (FDR) and gives p-values and q-values (estimates of the FDR) for each test. It was extended here by iteration to enforce consistency between the p-values of the tests and the binary parameters defining which data points contribute to the fitted null hypothesis. These parameters arise when the null hypothesis has to be estimated from the data. The ODP as previously described is optimal only for fixed values of these parameters. The extension proposed here requires the introduction of a cut-off parameter for the p-values. The work was motivated by the analysis of a set of pairs of frequencies representing gene expression for a set of genes in two libraries, from which it was desired to select the genes most likely not to follow the null hypothesis that the frequency ratio is a fixed unknown number. The method was tested by analysing many similar simulated datasets. The results showed that the ODP modified by iteration could be improved, sometimes greatly, by a suitable choice of the cut-off parameter. Varying this parameter alone, however, may not lead to the globally optimal solution: statistical testing based on the binomial distribution is more efficient than using a form of the ODP when the number of non-null hypotheses in the data is small, but the reverse is true when it is large. This may be an effect of using discrete data. Efficiency here is defined in terms of the expected proportion of errors that occur (q-value) when a given proportion of the data is declared "significant" (i.e. the null hypothesis is believed not to hold for them).
An improved version of the ODP along these lines is likely to have numerous applications, such as the optimised search for candidate genes that show unusual expression patterns, for example when more than two experimental conditions are compared simultaneously, and in cases where additional categorical variables or a time series are present in the experimental design.
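
The iteration described in the abstract can be illustrated with a minimal sketch. It is not the ODP and not the authors' implementation; it covers only the simpler binomial baseline the abstract compares against, under stated assumptions. Each gene is a pair of counts `(a, b)` from the two libraries. The null hypothesis of a fixed unknown frequency ratio is represented by a shared proportion `theta = a / (a + b)`. The null is re-estimated from the points whose p-value exceeds the cut-off until that set stops changing. The function names, the default cut-off of 0.05, and the Benjamini-Hochberg-style q-values (which take the null proportion as 1, unlike Storey's estimator) are all illustrative assumptions.

```python
from math import comb


def binom_two_sided_p(x, n, theta):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count x."""
    pmf = [comb(n, k) * theta**k * (1 - theta) ** (n - k) for k in range(n + 1)]
    limit = pmf[x] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(p for p in pmf if p <= limit))


def iterate_null(pairs, cutoff=0.05, max_iter=50):
    """Alternate between (1) estimating theta from the points currently
    treated as null and (2) recomputing p-values and the null set.
    Stops when the null set is unchanged or after max_iter rounds."""
    in_null = [True] * len(pairs)  # binary parameters: does point i fit the null?
    theta, pvals = None, []
    for _ in range(max_iter):
        tot_a = sum(a for (a, b), keep in zip(pairs, in_null) if keep)
        tot_n = sum(a + b for (a, b), keep in zip(pairs, in_null) if keep)
        if tot_n == 0:
            raise ValueError("no data points left to estimate the null")
        theta = tot_a / tot_n
        pvals = [binom_two_sided_p(a, a + b, theta) for a, b in pairs]
        new_in_null = [p > cutoff for p in pvals]
        if new_in_null == in_null:
            break  # p-values and null membership are now consistent
        in_null = new_in_null
    return theta, pvals, in_null


def bh_qvalues(pvals):
    """Benjamini-Hochberg-style q-values (null proportion assumed to be 1)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        q[i] = running
    return q


if __name__ == "__main__":
    # Hypothetical data: 20 genes with equal counts and one clear outlier.
    pairs = [(10, 10)] * 20 + [(30, 2)]
    theta, pvals, in_null = iterate_null(pairs)
    print(theta, in_null[-1], bh_qvalues(pvals)[-1])
```

In this toy dataset the outlier pulls the first estimate of `theta` above one half. Once its p-value falls below the cut-off it is excluded, and the estimate settles on the value implied by the remaining genes. The cut-off is the tunable parameter the abstract refers to; the sketch does not try to choose it optimally.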
