Journal of the American Medical Informatics Association

Synthesizing electronic health records using improved generative adversarial networks

Abstract

Objective: The aim of this study was to generate synthetic electronic health records (EHRs) that are more realistic than those generated using the existing medical generative adversarial network (medGAN) method.

Materials and Methods: We modified medGAN to obtain 2 synthetic data generation models, designated as medical Wasserstein GAN with gradient penalty (medWGAN) and medical boundary-seeking GAN (medBGAN), and compared the results obtained using the 3 models. We used 2 databases: MIMIC-III and the National Health Insurance Research Database (NHIRD), Taiwan. First, we trained the 3 models and used them to generate synthetic EHRs. We then analyzed and compared the models' performance by using statistical methods (Kolmogorov–Smirnov test, dimension-wise probability for binary data, and dimension-wise average count for count data) and 2 machine learning tasks (association rule mining and prediction).

Results: We conducted a comprehensive analysis and found that our models were adequately efficient for generating synthetic EHR data. The proposed models outperformed medGAN in all cases, and among the 3 models, medBGAN performed the best.

Discussion: The proposed models will be effective for generating realistic synthetic EHR data in the medical industry and related research, from the viewpoint of providing better services. Moreover, they will eliminate barriers such as limited access to EHR data and thus accelerate research on medical informatics.

Conclusion: The proposed models can adequately learn the data distribution of real EHRs and efficiently generate realistic synthetic EHRs. The results show the superiority of our models over the existing model.
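The statistical comparisons named in Materials and Methods can be illustrated with a minimal sketch. It assumes records are stored as equal-length lists of 0/1 flags (or counts), one list per patient. The function names, the toy data, and the choice to apply the Kolmogorov–Smirnov statistic to the per-dimension values are all assumptions made for illustration; the abstract does not specify how the authors implemented these steps.

```python
from bisect import bisect_right


def dimension_wise_mean(records):
    """Per-feature mean over all records.

    For binary records this is the dimension-wise probability (share of
    records with the feature set); for count records it is the
    dimension-wise average count.
    """
    n = len(records)
    return [sum(column) / n for column in zip(*records)]


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)  # share of a that is <= v
        cdf_b = bisect_right(b, v) / len(b)  # share of b that is <= v
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical toy data: 4 patients x 3 binary codes (not from the paper).
real = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
synthetic = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [1, 0, 1]]

real_p = dimension_wise_mean(real)
synth_p = dimension_wise_mean(synthetic)

# Assumption: compare the two sets of per-dimension probabilities with KS.
print(real_p, synth_p, ks_statistic(real_p, synth_p))
```

In this kind of comparison, per-dimension values from synthetic data that lie close to those from real data (and a small KS statistic) would suggest that the generator has captured the marginal distribution of each feature; it says nothing by itself about relationships between features, which is what the association rule mining and prediction tasks address.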
