Journal: Urban Science

Correcting Bias in Crowdsourced Data to Map Bicycle Ridership of All Bicyclists



Abstract

Traditional methods of counting bicyclists are resource-intensive and generate data with sparse spatial and temporal detail. Previous research suggests that big data from crowdsourced fitness apps offer a new source of bicycling data with high spatial and temporal resolution. However, crowdsourced bicycling data are biased because they oversample recreational riders. Our goals are to quantify geographical variables that can help correct bias in crowdsourced data, and to develop a generalized method to correct bias in big crowdsourced data on bicycle ridership in different settings, in order to generate maps for cities that are representative of all bicyclists at a street-level spatial resolution. We used street-level ridership data for 2016 from a crowdsourced fitness app (Strava), geographical covariate data, and official counts from 44 locations across Maricopa County, Arizona, USA (training data), and from 60 locations in the city of Tempe, within Maricopa County (test data). First, we quantified the relationship between Strava and official ridership data volumes. Second, we used a multi-step approach, with variable selection using LASSO followed by Poisson regression, to integrate geographical covariates, Strava, and training data to correct bias. Finally, we predicted bias-corrected average annual daily bicyclist counts for Tempe and evaluated the model's accuracy using the test data. We found a correlation between the annual ridership data from Strava and official counts (R² = 0.76) in Maricopa County for 2016. The significant variables for correcting bias were the proportion of white population, median household income, traffic speed, distance to residential areas, and distance to green spaces. The model could correct bias in crowdsourced data from Strava in Tempe, with 86% of road segments being predicted within a margin of 100 average annual bicyclists.
Our results indicate that it is possible to map ridership for cities at the street-level by correcting bias in crowdsourced bicycle ridership data, with access to adequate data from official count programs and geographical covariates at a comparable spatial and temporal resolution.
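
The multi-step approach described in the abstract (LASSO variable selection, then Poisson regression, then a check of how many segments fall within a margin) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the three covariates are hypothetical stand-ins (column 0 for a centred log Strava count, column 1 for a standardised traffic speed, column 2 for an irrelevant covariate), the penalty alpha = 0.05 is arbitrary, and running the LASSO on log counts is an assumption made here for simplicity. It uses only the Python standard library.

```python
import math
from itertools import product

def lasso_cd(X, y, alpha, n_iter=100):
    """Coordinate-descent LASSO minimising (1/2n)||y - Xb||^2 + alpha*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, starting from b = 0
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the partial residual (feature j left out)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = math.copysign(max(abs(rho) - alpha, 0.0), rho) / col_sq[j]
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def poisson_fit(X, y, n_iter=25):
    """Newton-Raphson for Poisson regression with a log link.
    The first column of X must be the intercept (all ones)."""
    n, p = len(X), len(X[0])
    beta = [math.log(sum(y) / n)] + [0.0] * (p - 1)  # intercept-only start
    for _ in range(n_iter):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        grad = [sum(X[i][j] * (y[i] - mu[i]) for i in range(n)) for j in range(p)]
        hess = [[sum(X[i][j] * X[i][k] * mu[i] for i in range(n))
                 for k in range(p)] for j in range(p)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def predict(beta, X):
    """Predicted counts exp(X @ beta)."""
    return [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]

def share_within(pred, obs, margin=100.0):
    """Fraction of segments whose prediction is within +/- margin of the count."""
    return sum(abs(p - o) <= margin for p, o in zip(pred, obs)) / len(obs)

def true_count(row):
    # Synthetic "official count": depends on columns 0 and 1 only (assumed values)
    return round(math.exp(3.0 + 0.8 * row[0] - 0.3 * row[1]))

# Synthetic training segments on a balanced grid, so every column is centred
X_train = [list(r) for r in product((-1.5, -0.5, 0.5, 1.5), repeat=3)]
y_train = [true_count(r) for r in X_train]

# Step 1: LASSO on centred log counts picks the covariates to keep
log_y = [math.log(v + 1) for v in y_train]
mean_log = sum(log_y) / len(log_y)
coefs = lasso_cd(X_train, [v - mean_log for v in log_y], alpha=0.05)
selected = [j for j, c in enumerate(coefs) if c != 0.0]

def design(X):
    """Intercept plus the selected covariate columns."""
    return [[1.0] + [row[j] for j in selected] for row in X]

# Step 2: Poisson regression on the selected covariates
beta = poisson_fit(design(X_train), y_train)

# Step 3: evaluate on separate synthetic test segments
X_test = [list(r) for r in product((-1.0, 0.0, 1.0), repeat=3)]
y_test = [true_count(r) for r in X_test]
share = share_within(predict(beta, design(X_test)), y_test)
print(selected, [round(b, 2) for b in beta], share)
```

With real data, the rows of X_train would be the count locations, the columns the Strava volumes and geographical covariates, and y_train the official counts. A statistics library would normally replace the hand-written solvers. The margin of 100 mirrors the figure quoted in the abstract; the synthetic counts here are far too small for that margin to be a demanding test.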
