Journal: Behavioral Ecology and Sociobiology

Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse

Abstract

Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one 'significant' effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies ('the winner's curse'). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
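The 40% figure quoted in the abstract can be checked against a simple back-of-the-envelope calculation. The sketch below is not taken from the paper; it assumes that each term is tested at a nominal significance level of 0.05 and that the tests are independent, so the chance of at least one false positive among k terms is 1 - (1 - alpha)^k. The function names `n_terms` and `familywise_error` are hypothetical helpers introduced only for this illustration.

```python
from math import comb


def n_terms(n_predictors: int) -> int:
    """Count main effects plus all two-way interactions (assumed model structure)."""
    return n_predictors + comb(n_predictors, 2)


def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one p < alpha among k tests,
    assuming all null hypotheses are true and the tests are independent."""
    return 1.0 - (1.0 - alpha) ** k


k = n_terms(4)  # 4 main effects + 6 two-way interactions = 10 terms
print(k, round(familywise_error(k), 3))  # 10 0.401
```

Under these assumptions, four predictors and their six two-way interactions give k = 10 and an expected rate of about 0.40, which matches the theoretical expectation the abstract mentions for large N/k. The abstract's further point, that over-fitted models (low N/k) exceed this rate, is not captured by this formula and would need simulation.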
