Journal: Computational Statistics & Data Analysis

Bias reduction in the population size estimation of large data sets

Abstract

Estimating the population size of large data sets and hard-to-reach populations can be a significant problem. For example, in the military, manpower is limited and the manual processing of large data sets can be time-consuming. In addition, access to the full population of data may be restricted by factors such as cost, time, and safety. Four new population size estimators are proposed as extensions of existing methods, and their performance is compared, in terms of bias, with two existing methods from the big data literature. The proposed estimators would be particularly beneficial in the context of time-critical decisions or actions. The comparison is based on a simulation study and an application to five real network data sets (Twitter, LiveJournal, Pokec, YouTube, Wikipedia Talk). Whilst no single estimator (out of the four proposed) generates the most accurate estimates overall, the proposed estimators are shown to produce more accurate population size estimates for small sample sizes, although in some cases they show more variability than existing estimators in the literature. (C) 2020 Elsevier B.V. All rights reserved.
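
The abstract does not name the estimators it proposes or compares, so the following is only an illustrative sketch of what "bias in population size estimation for small samples" can look like. As a stand-in, it uses the classical two-sample capture-recapture setting with the Lincoln-Petersen estimator and Chapman's bias-reduced variant. These are assumptions chosen for illustration, not the paper's methods, and the function names and simulation parameters are hypothetical.

```python
import random
import statistics


def lincoln_petersen(n1, n2, m):
    """Classical two-sample estimate n1 * n2 / m (infinite when m == 0)."""
    if m == 0:
        return float("inf")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-reduced variant; stays finite even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def simulate(true_size, n1, n2, trials, seed=0):
    """Return the mean Lincoln-Petersen and mean Chapman estimates.

    Each trial draws two independent samples without replacement and
    counts the overlap m. Trials with m == 0 are skipped for
    Lincoln-Petersen only, because that estimate is undefined there.
    Assumes at least one trial has m > 0.
    """
    rng = random.Random(seed)
    population = range(true_size)
    lp_estimates, chapman_estimates = [], []
    for _ in range(trials):
        first = set(rng.sample(population, n1))
        second = rng.sample(population, n2)
        m = sum(1 for unit in second if unit in first)
        if m > 0:
            lp_estimates.append(lincoln_petersen(n1, n2, m))
        chapman_estimates.append(chapman(n1, n2, m))
    return statistics.mean(lp_estimates), statistics.mean(chapman_estimates)


if __name__ == "__main__":
    lp_mean, chapman_mean = simulate(true_size=1000, n1=100, n2=100, trials=5000)
    print(f"true size: 1000, Lincoln-Petersen mean: {lp_mean:.1f}, "
          f"Chapman mean: {chapman_mean:.1f}")
```

With samples that are small relative to the population, the overlap count is small and noisy, so the plain ratio estimate tends to overshoot on average, while the corrected variant stays closer to the true size. This mirrors, in a much simpler setting, the kind of bias comparison the abstract describes.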
