Journal: Pharmacoepidemiology and Drug Safety

Imputing missing covariates in time‐to‐event analysis within distributed research networks: A simulation study



Abstract

Purpose: In distributed research network (DRN) settings, multiple imputation cannot be implemented directly because pooling individual-level data is often not feasible. The performance of multiple imputation combined with meta-analysis within DRNs is not well understood.

Methods: We evaluated imputation of missing baseline covariate data combined with meta-analysis for time-to-event analysis within DRNs. Through simulation studies motivated by a real-world data set, we compared two parametric algorithms, an approximated linear imputation model (Approx) and a nonlinear substantive model compatible imputation model (SMC), and two non-parametric machine learning algorithms, random forest (RF) and classification and regression trees (CART).

Results: With small effect sizes (i.e., log hazard ratios, logHRs) and homogeneous missingness mechanisms across sites, all imputation methods produced unbiased and more efficient estimates, whereas complete-case analysis could be biased and inefficient. Under heterogeneous missingness mechanisms, estimates from the RF method could have higher efficiency. Estimates from distributed imputation combined by meta-analysis were similar to those from imputation using pooled data. When logHRs were large, the SMC imputation algorithm generally performed better than the others.

Conclusions: These findings support the validity and feasibility of imputation within DRNs when covariate data are missing in time-to-event analysis under various settings. The performance of the four imputation algorithms varies with the effect sizes and the level of missingness.
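The two-stage combination described in the abstract (imputation within each site, then meta-analysis across sites) can be illustrated with a minimal sketch. The abstract does not state which pooling or meta-analysis estimator the authors used, so this sketch assumes Rubin's rules within each site and a fixed-effect inverse-variance meta-analysis across sites. The function names, site labels, and all numbers are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def rubin_pool(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Combine logHR estimates from m imputed data sets at one site.

    Assumption: Rubin's rules. Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = sum(estimates) / m                      # pooled point estimate
    within = sum(variances) / m                     # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between          # total variance
    return q_bar, total


def fixed_effect_meta(estimates: Sequence[float],
                      variances: Sequence[float]) -> Tuple[float, float]:
    """Combine site-level logHRs by inverse-variance weighting.

    Assumption: fixed-effect model. Returns (pooled estimate, variance).
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need matching, non-empty estimates and variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * q for w, q in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)


# Hypothetical per-site results: (logHR estimates, variances), one pair per imputation.
sites = {
    "site_a": ([0.10, 0.20, 0.30], [0.04, 0.04, 0.04]),
    "site_b": ([0.40, 0.40, 0.40], [0.02, 0.02, 0.02]),
}

# Stage 1: each site pools its own imputations; only summaries leave the site.
site_summaries = [rubin_pool(est, var) for est, var in sites.values()]

# Stage 2: the coordinating centre meta-analyses the site-level summaries.
overall_loghr, overall_var = fixed_effect_meta(
    [s[0] for s in site_summaries],
    [s[1] for s in site_summaries],
)
print(f"pooled logHR = {overall_loghr:.4f}, SE = {overall_var ** 0.5:.4f}")
```

Only site-level estimates and variances cross site boundaries in this sketch, which is the constraint that motivates distributed analysis in the first place. A random-effects meta-analysis would be a natural alternative when sites are heterogeneous.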
