Journal: Paediatric and Perinatal Epidemiology

Comparing methods of analysing datasets with small clusters: case studies using four paediatric datasets.

Abstract

Studies of prematurely born infants contain a relatively large percentage of multiple births, so the resulting data have a hierarchical structure with small clusters of size 1, 2 or 3. Ignoring the clustering may lead to incorrect inferences. The aim of this study was to compare statistical methods which can be used to analyse such data: generalised estimating equations, multilevel models, multiple linear regression and logistic regression. Four datasets which differed in total size and in percentage of multiple births (n = 254, multiple 18%; n = 176, multiple 9%; n = 10 098, multiple 3%; n = 1585, multiple 8%) were analysed. With the continuous outcome, two-level models produced similar results in the larger dataset, while generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) produced divergent estimates using the smaller dataset. For the dichotomous outcome, most methods, except generalised least squares multilevel modelling (ML GH 'xtlogit' in Stata), gave similar odds ratios and 95% confidence intervals within datasets. For the continuous outcome, our results suggest using multilevel modelling. We conclude that generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) should be used with caution when the dataset is small. Where the outcome is dichotomous and there is a relatively large percentage of non-independent data, it is recommended that these are accounted for in analyses using logistic regression with adjusted standard errors or multilevel modelling. If, however, the dataset has a small percentage of clusters greater than size 1 (e.g. a population dataset of children where there are few multiples) there appears to be less need to adjust for clustering.
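To illustrate why ignoring clustering can mislead, the following is a minimal Python sketch of a standard error "adjusted" for clustering, computed for a simple mean. It is not the authors' Stata analysis: the function name `naive_and_cluster_robust_se` and the data are hypothetical, and the adjustment shown is the basic sandwich (CR0) estimator without a small-sample correction, chosen here only because it is easy to follow.

```python
import math


def naive_and_cluster_robust_se(clusters):
    """Return (mean, naive SE, cluster-robust SE) for the overall mean.

    `clusters` is a list of lists: one inner list of outcomes per mother
    (singletons have length 1, twins length 2, triplets length 3).
    """
    values = [y for cluster in clusters for y in cluster]
    n = len(values)
    mean = sum(values) / n

    # Naive SE: treats all n infants as independent observations.
    sum_sq = sum((y - mean) ** 2 for y in values)
    naive_se = math.sqrt(sum_sq / (n - 1) / n)

    # Cluster-robust (sandwich, CR0) SE: residuals are summed within each
    # cluster before squaring, so within-cluster correlation is retained.
    meat = sum(sum(y - mean for y in cluster) ** 2 for cluster in clusters)
    robust_se = math.sqrt(meat) / n

    return mean, naive_se, robust_se


# Hypothetical data: four twin pairs whose members have identical outcomes.
twins = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0], [7.0, 7.0]]
# The same eight values treated as eight singleton births.
singletons = [[y] for pair in twins for y in pair]

print(naive_and_cluster_robust_se(twins))
print(naive_and_cluster_robust_se(singletons))
```

In the twin example the cluster-robust standard error is larger than the naive one, because each pair contributes less independent information than two unrelated infants would. When every cluster has size 1, the two differ only by the factor sqrt((n - 1) / n), which is consistent with the abstract's remark that adjustment matters less when few clusters are larger than size 1.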