
Generating a close-to-reality synthetic population of Ghana


Abstract

The purpose of this research is to generate a close-to-reality synthetic human population for use in a geosimulation of urban dynamics. Two commonly accepted approaches to generating synthetic human populations are Iterative Proportional Fitting (IPF) and Resampling with Replacement. While these methods are effective at reproducing one instance of the probability model describing the survey, that instance has extremely small variability among subgroups and is very unlikely to be the real population. IPF and Resampling with Replacement also rely on pure replication of units from the underlying sample, which can increase unrealistic model behavior. In this work we present a sequential logic for estimating variables using multinomial logistic regressions and the conditional probabilities among the variables, in order to generate combinations that were not represented in the original survey but are likely to occur in the real population. We also present a model-based approach to imputing missing observation responses and apply the methodology to the Ghana Living Standard Survey 5 (GLSS5) in order to generate a comprehensive synthetic population for the Republic of Ghana, including such household and person variables as household size, tribal affiliation, educational attainment and annual income, among others. The R language and environment for statistical computing was used, together with the packages VIM and simPopulation, in developing and executing the code. Contingency coefficients, cumulative distributions, mosaic plots, and box plots are presented for evaluation in order to demonstrate the effectiveness of the new method in its application to Ghana.
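To make the limitation of IPF described above concrete, the following is a minimal sketch of two-dimensional Iterative Proportional Fitting. It is not the authors' code (the abstract states that they worked in R with VIM and simPopulation); it is written in plain Python for illustration only. The function name `ipf`, the seed table, and the target margins are all hypothetical, made-up values. The sketch alternately rescales rows and columns of a sample cross-tabulation until both margins match the targets.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rescale a 2-D seed table until its margins match the targets.

    Hypothetical helper for illustration; assumes the row and column
    targets have the same grand total.
    """
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(max_iter):
        # Row step: scale each row so it sums to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Column step: scale each column so it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Stop once the row margins still match after the column step.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table


# Made-up sample cross-tabulation with one combination never observed (0.0).
seed = [[10.0, 0.0],
        [30.0, 40.0]]
row_targets = [400.0, 600.0]  # assumed population totals per row category
col_targets = [450.0, 550.0]  # assumed population totals per column category

fitted = ipf(seed, row_targets, col_targets)
```

Because IPF only multiplies existing cells by scaling factors, the cell that is zero in the seed stays zero in the fitted table. This is one way to see why a combination absent from the survey cannot appear in a population built this way, which is the gap the regression-based approach in the abstract is meant to address.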
