Journal: Transportation research

Prediction of rare feature combinations in population synthesis: Application of deep generative modelling



Abstract

Population synthesis is concerned with the generation of agents for agent-based modelling in many fields, such as economics, transportation, ecology and epidemiology. When the number of attributes describing the agents and/or their level of detail becomes large, survey data cannot densely support the joint distribution of the attributes in the population due to the curse of dimensionality. As a result, many attribute combinations that exist in the real population are missing from the sample data. In this case, it becomes essential to consider methods that can impute such missing information effectively. In this paper, we propose to use deep generative latent models. These models learn a compressed representation of the data space which, when projected back to the original space, provides an effective way of imputing information in the observed data space. Specifically, we employ the Wasserstein Generative Adversarial Network (WGAN) and the Variational Autoencoder (VAE) for a large-scale population synthesis application. The models are applied to a Danish travel survey with a feature space of more than 60 variables, and are trained and tested using cross-validation. We propose a new metric for evaluating generative models in an unsupervised setting. It measures the ability to generate diverse yet valid synthetic attribute combinations by comparing whether the models can recover missing combinations (sampling zeros) while keeping truly impossible combinations (structural zeros) at a minimum. In a low-dimensional experiment, the VAE, the marginal sampler and the fully random sampler generate 5%, 21% and 26% more structural zeros per sampling zero, respectively, than the WGAN. In a high-dimensional case, these figures increase to 44%, 2217% and 170440%, respectively.
This research directly supports the development of agent-based systems, in particular in cases where detailed socio-economic or geographical representations are required.
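
The ratio of structural zeros to sampling zeros described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions made for illustration only: a held-out sample stands in for the real population, any generated combination found in neither the training nor the held-out data is treated as a structural zero, and distinct combinations are counted instead of individual agents. The function names `zero_counts`, `structural_per_sampling` and `percent_more` are hypothetical.

```python
from typing import Hashable, Iterable, Tuple

# An attribute combination, e.g. ("adult", "car"). Hypothetical encoding.
Combo = Tuple[Hashable, ...]


def zero_counts(
    train: Iterable[Combo],
    holdout: Iterable[Combo],
    generated: Iterable[Combo],
) -> Tuple[int, int]:
    """Count distinct generated combinations that are absent from training.

    Assumption: a combination that appears in the held-out data but not in
    the training data is a recovered sampling zero; one that appears in
    neither is treated as a structural zero.
    """
    train_set = set(train)
    holdout_set = set(holdout)
    unseen = set(generated) - train_set      # not observed during training
    sampling = unseen & holdout_set          # unseen but confirmed to exist
    structural = unseen - holdout_set        # unseen and unconfirmed
    return len(sampling), len(structural)


def structural_per_sampling(
    train: Iterable[Combo],
    holdout: Iterable[Combo],
    generated: Iterable[Combo],
) -> float:
    """Structural zeros generated per recovered sampling zero (lower is better)."""
    n_sampling, n_structural = zero_counts(train, holdout, generated)
    if n_sampling == 0:
        return float("inf")  # no sampling zeros recovered at all
    return n_structural / n_sampling


def percent_more(model_ratio: float, baseline_ratio: float) -> float:
    """Relative increase of one model's ratio over a baseline, in percent."""
    return 100.0 * (model_ratio / baseline_ratio - 1.0)


# Toy data, invented purely for illustration.
train = [("adult", "car"), ("child", "none")]
holdout = [("adult", "bike"), ("adult", "car")]
generated = [("adult", "bike"), ("child", "car"), ("adult", "car"), ("child", "car")]

counts = zero_counts(train, holdout, generated)
ratio = structural_per_sampling(train, holdout, generated)
```

Under these assumptions, a statement such as "5% more structural zeros per sampling zero than the WGAN" would correspond to `percent_more(vae_ratio, wgan_ratio)` being about 5, where both ratios are computed on the same training and held-out split.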
