
Minimum sample size for developing a multivariable prediction model: PART II ‐ binary and time‐to‐event outcomes



Abstract

When designing a study to develop a new prediction model with binary or time‐to‐event outcomes, researchers should ensure their sample size is adequate in terms of the number of participants (n) and outcome events (E) relative to the number of predictor parameters (p) considered for inclusion. We propose that the minimum values of n and E (and subsequently the minimum number of events per predictor parameter, EPP) should be calculated to meet the following three criteria: (i) small optimism in predictor effect estimates, as defined by a global shrinkage factor of ≥0.9; (ii) a small absolute difference of ≤0.05 between the model's apparent and adjusted Nagelkerke's R²; and (iii) precise estimation of the overall risk in the population. Criteria (i) and (ii) aim to reduce overfitting conditional on a chosen p, and require prespecification of the model's anticipated Cox‐Snell R², which we show can be obtained from previous studies. The values of n and E that meet all three criteria provide the minimum sample size required for model development. Upon application of our approach, a new diagnostic model for Chagas disease requires an EPP of at least 4.8, and a new prognostic model for recurrent venous thromboembolism requires an EPP of at least 23. This reinforces why rules of thumb (e.g., 10 EPP) should be avoided. Researchers might additionally ensure the sample size gives precise estimates of key predictor effects; this is especially important when key categorical predictors have few events in some categories, as this may substantially increase the numbers required.
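The three criteria can be sketched for a binary outcome as follows. The abstract does not state the formulas, so this sketch assumes the closed-form expressions usually associated with this approach: n = p / ((S − 1) ln(1 − R²_CS / S)) for a target shrinkage S, a maximum Cox‐Snell R² derived from the null-model likelihood at the outcome prevalence, and a normal-approximation 95% interval for the overall risk. The function names, the 0.05 margin for criterion (iii), and the example inputs (24 parameters, R²_CS = 0.288, prevalence 0.174) are illustrative assumptions, not values taken from the abstract. Time‐to‐event outcomes are not covered.

```python
import math


def n_for_shrinkage(p, r2_cs, shrinkage):
    """Sample size giving an expected global shrinkage factor of `shrinkage`
    for p predictor parameters and an anticipated Cox-Snell R^2 (assumed formula)."""
    if not 0.0 < r2_cs < shrinkage < 1.0:
        raise ValueError("need 0 < r2_cs < shrinkage < 1")
    return p / ((shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))


def max_r2_cs_binary(prevalence):
    """Largest attainable Cox-Snell R^2 for a binary outcome,
    from the per-participant log-likelihood of the intercept-only model."""
    ln_l_null = (prevalence * math.log(prevalence)
                 + (1.0 - prevalence) * math.log(1.0 - prevalence))
    return 1.0 - math.exp(2.0 * ln_l_null)


def min_sample_size_binary(p, r2_cs, prevalence,
                           shrinkage=0.9, r2_diff=0.05, margin=0.05):
    """Return the minimum n, expected events E, and EPP meeting criteria (i)-(iii)."""
    # (i) global shrinkage factor of at least `shrinkage`
    n1 = math.ceil(n_for_shrinkage(p, r2_cs, shrinkage))
    # (ii) apparent minus adjusted Nagelkerke R^2 of at most `r2_diff`,
    #      expressed as an equivalent shrinkage target
    s2 = r2_cs / (r2_cs + r2_diff * max_r2_cs_binary(prevalence))
    n2 = math.ceil(n_for_shrinkage(p, r2_cs, s2))
    # (iii) overall risk estimated to within +/- `margin` (95% interval)
    n3 = math.ceil((1.96 / margin) ** 2 * prevalence * (1.0 - prevalence))
    n = max(n1, n2, n3)
    events = n * prevalence
    return {"n": n, "events": events, "epp": events / p,
            "by_criterion": (n1, n2, n3)}


# Hypothetical inputs for illustration only.
result = min_sample_size_binary(p=24, r2_cs=0.288, prevalence=0.174)
print(result)
```

With these assumed inputs the binding constraint is criterion (ii), and the resulting EPP is close to the 4.8 quoted for the Chagas disease example. A lower anticipated R²_CS or a larger p pushes the required n and EPP upward, which is why a fixed rule such as 10 EPP cannot be adequate in every setting.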
