BMC Medical Research Methodology

A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a time-varying covariate with a non-linear association with time: a simulation study



Abstract

Background: Missing data is a common problem in epidemiological studies, and is particularly prominent in longitudinal data, which involve multiple waves of data collection. Traditional multiple imputation (MI) methods (fully conditional specification (FCS) and multivariate normal imputation (MVNI)) treat each repeated measurement of the same time-dependent variable as just another distinct variable for imputation and therefore do not make the most of the longitudinal structure of the data. Only a few studies have explored extensions to the standard approaches to account for the temporal structure of longitudinal data. One suggestion is the two-fold fully conditional specification (two-fold FCS) algorithm, which restricts the imputation of a time-dependent variable to time blocks, so that the imputation model includes only measurements taken at the specified and adjacent times. To date, no study has investigated the performance of two-fold FCS and standard MI methods for handling missing data in a time-varying covariate with a non-linear trajectory over time, a commonly encountered scenario in epidemiological studies.

Methods: We simulated 1000 datasets of 5000 individuals based on the Longitudinal Study of Australian Children (LSAC). Three missing data mechanisms, missing completely at random (MCAR) and weak and strong missing at random (MAR) scenarios, were used to impose missingness on body mass index (BMI) for age z-scores, a continuous time-varying exposure variable with a non-linear trajectory over time. We evaluated the performance of FCS, MVNI, and two-fold FCS for handling up to 50% missing data when assessing the association between childhood obesity and sleep problems.

Results: The standard two-fold FCS produced slightly more biased and less precise estimates than FCS and MVNI. We observed slight improvements in bias and precision when using a time window width of two for the two-fold FCS algorithm compared to the standard width of one.

Conclusion: We recommend FCS or MVNI in similar longitudinal settings. When convergence issues arise because of a large number of time points or variables with missing values, we recommend two-fold FCS, with exploration of a suitable time window width.
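The two ideas the abstract relies on, the two-fold FCS time window and the MCAR/MAR missingness mechanisms, can be illustrated with a minimal Python sketch. It is not the authors' simulation code. The function names `time_block` and `impose_missingness` are hypothetical, and the logistic form of the MAR model, the `strength` parameter, and the auxiliary variable `aux` are assumptions, since the abstract does not specify the missingness models.

```python
import math
import random


def time_block(wave, n_waves, width=1):
    """Waves inside the time block centred on `wave` (0-indexed).

    With width=1 the block holds the specified wave and its immediate
    neighbours; with width=2 it reaches two waves in each direction.
    The block is clipped at the first and last wave.
    """
    lo = max(0, wave - width)
    hi = min(n_waves - 1, wave + width)
    return list(range(lo, hi + 1))


def impose_missingness(values, aux, mechanism, rate=0.5, strength=1.0, rng=None):
    """Return a copy of `values` with some entries replaced by None.

    MCAR: every entry is missing with the same probability `rate`.
    MAR (assumed logistic model): the missingness probability depends on
    the fully observed auxiliary value `aux[i]`; a larger `strength`
    gives a stronger dependence. At aux == 0 the probability equals `rate`.
    """
    rng = rng or random.Random(0)
    intercept = math.log(rate / (1.0 - rate))
    result = []
    for value, a in zip(values, aux):
        if mechanism == "MCAR":
            p = rate
        elif mechanism == "MAR":
            p = 1.0 / (1.0 + math.exp(-(intercept + strength * a)))
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MAR'")
        result.append(None if rng.random() < p else value)
    return result


# Usage: BMI z-scores at one wave for 2000 simulated children, with a
# binary auxiliary variable coded -1 / +1 (purely illustrative data).
rng = random.Random(42)
aux = [-1.0] * 1000 + [1.0] * 1000
bmi_z = [rng.gauss(0.0, 1.0) for _ in aux]
mcar = impose_missingness(bmi_z, aux, "MCAR", rate=0.5, rng=random.Random(1))
mar = impose_missingness(bmi_z, aux, "MAR", rate=0.5, strength=2.0, rng=random.Random(2))
```

Widening the window from `width=1` to `width=2` adds predictors from waves further away, which is the change the Results paragraph reports as slightly improving bias and precision.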
