Journal: BMC Medical Research Methodology

A simulation study of sample size for multilevel logistic regression models

Abstract

Background: Many studies conducted in the health and social sciences collect individual-level data as outcome measures. Such data usually have a hierarchical structure, with patients clustered within physicians and physicians clustered within practices. Large survey data, including national surveys, also have a hierarchical or clustered structure: respondents are naturally clustered in geographical units (e.g., health regions) and may be grouped into smaller units. Outcomes of interest in many fields include not only continuous measures but also binary outcomes such as depression, presence or absence of a disease, and self-reported general health. In multilevel studies, an important problem is calculating a sample size that is adequate to produce unbiased and accurate estimates.

Methods: In this paper, simulation studies are used to assess the effect of varying the sample size at both the individual and the group level on the accuracy of the estimates of the parameters and variance components of multilevel logistic regression models. In addition, the influence of the prevalence of the outcome and of the intra-class correlation coefficient (ICC) is examined.

Results: The estimates of the fixed effect parameters are unbiased for 100 groups with a group size of 50 or higher. The estimates of the variance-covariance components are slightly biased even with 100 groups and a group size of 50. The biases for both fixed and random effects are severe for a group size of 5. The standard errors of the fixed effect parameters are unbiased, while those of the variance-covariance components are underestimated. The results suggest that low-prevalence events require larger sample sizes, with at least 100 groups and 50 individuals per group.

Conclusion: We recommend using a minimum group size of 50 with at least 50 groups to produce valid estimates for multilevel logistic regression models. Group size should be adjusted when the prevalence of events is low, such that the expected number of events in each group is greater than one.
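The kind of data such a simulation draws, and the group-size rule in the conclusion, can be illustrated with a short sketch. This is not the authors' code. It assumes a simplified random-intercept logistic model with one standard-normal covariate, whereas the abstract's reference to variance-covariance components suggests the paper's model may be richer. The function names, the parameter values, and the latent-scale ICC formula (a common definition, not confirmed by the abstract) are all assumptions made for illustration.

```python
import math
import random


def latent_icc(sigma2_u):
    """ICC on the latent logistic scale: sigma_u^2 / (sigma_u^2 + pi^2 / 3).

    Assumption: a common definition for logistic models, not necessarily
    the one used in the paper.
    """
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3)


def simulate_clustered_binary(n_groups, group_size, beta0, beta1, sigma2_u, seed=0):
    """Draw (group, x, y) rows from a random-intercept logistic model (hypothetical design)."""
    rng = random.Random(seed)
    rows = []
    for g in range(n_groups):
        # Group-level random intercept shared by every individual in group g.
        u = rng.gauss(0.0, math.sqrt(sigma2_u))
        for _ in range(group_size):
            x = rng.gauss(0.0, 1.0)  # individual-level covariate
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x + u)))
            y = 1 if rng.random() < p else 0
            rows.append((g, x, y))
    return rows


def min_group_size(prevalence):
    """Smallest group size n with expected events n * prevalence greater than one."""
    return math.floor(1.0 / prevalence) + 1


if __name__ == "__main__":
    # 100 groups of 50, matching the largest design mentioned in the abstract.
    data = simulate_clustered_binary(100, 50, beta0=-1.0, beta1=0.5, sigma2_u=0.5)
    observed_prevalence = sum(y for _, _, y in data) / len(data)
    print("rows:", len(data))
    print("observed prevalence:", round(observed_prevalence, 3))
    print("latent ICC:", round(latent_icc(0.5), 3))
    print("min group size at prevalence 0.25:", min_group_size(0.25))
```

A full replication would repeat this draw many times for each combination of group count, group size, prevalence, and ICC, fit a multilevel logistic model to each data set, and compare the estimates with the true parameter values. The fitting step is omitted here.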
