Journal: Nature Communications

Multiple imputation for analysis of incomplete data in distributed health data networks


Abstract

Distributed health data networks (DHDNs) leverage data from multiple sources or sites, such as electronic health records (EHRs) from multiple healthcare systems. They have drawn increasing interest in recent years because they do not require sharing of subject-level data and hence considerably lower the hurdles for collaboration between institutions. However, DHDNs face a number of challenges in data analysis, particularly in the presence of missing data. The current state-of-the-art methods for handling incomplete data require pooling data into a central repository before analysis, which is not feasible in DHDNs. In this paper, we address the missing data problem in distributed environments such as DHDNs, which has not been investigated previously. We develop communication-efficient distributed multiple imputation methods for incomplete data that are horizontally partitioned. Since subject-level data are not shared or transferred outside of each site in the proposed methods, they enhance protection of patient privacy and have the potential to strengthen public trust in analysis of sensitive health data. We investigate the performance of these methods through extensive simulation studies. Our methods are applied to the analysis of an acute stroke dataset collected from multiple hospitals, mimicking a DHDN where health data are horizontally partitioned across hospitals and subject-level data cannot be shared or sent to a central data repository.

Summary: Distributed health data networks (DHDNs) leverage data from multiple healthcare systems, but often face major analytical challenges in the presence of missing data. This paper develops distributed multiple imputation methods that do not require sharing subject-level data across health systems.
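To make the idea of imputing horizontally partitioned data without moving subject-level records more concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not spell out. It assumes a simple setting chosen for illustration: one incomplete continuous variable `y`, fully observed covariates `X`, and a normal linear imputation model. All function names (`site_summaries`, `pooled_fit`, `draw_parameters`, `impute_locally`) are hypothetical. Each site sends only aggregate sufficient statistics; a coordinator combines them, draws one set of model parameters per imputation, and each site fills in its own missing values locally.

```python
import numpy as np

def site_summaries(X, y):
    """Run locally at one site: aggregate statistics from complete cases only."""
    obs = ~np.isnan(y)
    Xo, yo = X[obs], y[obs]
    return {"XtX": Xo.T @ Xo, "Xty": Xo.T @ yo,
            "yty": float(yo @ yo), "n": int(obs.sum())}

def pooled_fit(summaries):
    """Run at the coordinator: least-squares fit from summed site statistics."""
    XtX = sum(s["XtX"] for s in summaries)
    Xty = sum(s["Xty"] for s in summaries)
    yty = sum(s["yty"] for s in summaries)
    n = sum(s["n"] for s in summaries)
    beta = np.linalg.solve(XtX, Xty)
    rss = yty - Xty @ beta  # residual sum of squares, y'y - y'X beta_hat
    return beta, XtX, rss, n

def draw_parameters(beta, XtX, rss, n, rng):
    """Draw (beta*, sigma2*) from the usual noninformative-prior posterior."""
    p = beta.shape[0]
    sigma2 = rss / rng.chisquare(n - p)
    # XtX = L L', so solving L' v = z gives v with covariance inv(XtX).
    L = np.linalg.cholesky(XtX)
    z = rng.standard_normal(p)
    beta_star = beta + np.sqrt(sigma2) * np.linalg.solve(L.T, z)
    return beta_star, sigma2

def impute_locally(X, y, beta_star, sigma2, rng):
    """Run locally at one site: fill missing y using the broadcast parameters."""
    y_imp = y.copy()
    mis = np.isnan(y)
    noise = rng.normal(0.0, np.sqrt(sigma2), mis.sum())
    y_imp[mis] = X[mis] @ beta_star + noise
    return y_imp
```

Under these assumptions, the pooled estimate matches what a centralized least-squares fit on the stacked complete cases would give, while only a p-by-p matrix, a p-vector, and two scalars leave each site. Repeating `draw_parameters` and `impute_locally` M times would yield M completed datasets per site, to be analyzed and combined in the usual multiple imputation fashion.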
