Published in: Pacific Rim International Conference on Artificial Intelligence

A Correlation Based Imputation Method for Incomplete Traffic Accident Data

Abstract

Death, injury, and disability from road traffic crashes continue to be a major global public health problem. Recent data suggest that traffic crashes cause more than 1.25 million fatalities each year, with non-fatal injuries affecting a further 20-50 million people. It is predicted that by 2030 road traffic accidents will have become the fifth leading cause of death and that annual deaths from traffic accidents will have doubled from current levels. Therefore, methods to reduce accident severity are of great interest to traffic agencies and the public at large. The road accident fatality rate depends on many factors, and the large number of environmental and road-related attributes makes investigating the dependencies between them very challenging. Any missing data in the database could obscure the discovery of important factors and lead to invalid conclusions. To make traffic accident datasets useful for analysis, they should be preprocessed properly. In this paper, we present a novel method for imputing missing values, based on sampling from distributions obtained from correlation measures, to improve the quality of traffic accident data. We evaluated our algorithm using two publicly available traffic accident databases from the United States (explore.data.gov, data.opencolorado.org). Our results indicate that the proposed method performs significantly better than three existing algorithms.
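The abstract names the idea (sampling from distributions obtained from correlation measures) without giving the algorithm, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes categorical attributes, uses Cramér's V as the correlation measure, and fills each gap by sampling from the distribution of the target attribute conditioned on its most strongly associated observed attribute. The function names `cramers_v` and `impute_column` and the toy column names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def cramers_v(rows, a, b):
    """Cramér's V between categorical columns a and b (assumed measure).

    Only rows where both values are present (not None) are used.
    """
    pairs = [(r[a], r[b]) for r in rows if r[a] is not None and r[b] is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    count_a = Counter(x for x, _ in pairs)
    count_b = Counter(y for _, y in pairs)
    k = min(len(count_a), len(count_b))
    if k < 2:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for x, na in count_a.items():
        for y, nb in count_b.items():
            expected = na * nb / n
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


def impute_column(rows, target, rng):
    """Return copies of rows with missing `target` values filled by sampling."""
    others = [c for c in rows[0] if c != target]
    strength = {c: cramers_v(rows, target, c) for c in others}

    # Count target values overall and per value of every other column.
    marginal = Counter()
    cond = {c: defaultdict(Counter) for c in others}
    for r in rows:
        if r[target] is None:
            continue
        marginal[r[target]] += 1
        for c in others:
            if r[c] is not None:
                cond[c][r[c]][r[target]] += 1

    filled = []
    for r in rows:
        r = dict(r)  # leave the caller's rows untouched
        if r[target] is None:
            counts = marginal  # fallback when no predictor is usable
            # Try predictors from strongest to weakest association.
            for c in sorted(others, key=strength.get, reverse=True):
                if r[c] is not None and r[c] in cond[c]:
                    counts = cond[c][r[c]]
                    break
            if counts:
                values = list(counts)
                weights = [counts[v] for v in values]
                r[target] = rng.choices(values, weights=weights)[0]
        filled.append(r)
    return filled


# Toy records (invented for illustration); None marks a missing value.
rows = [
    {"weather": "rain", "surface": "wet", "light": "day"},
    {"weather": "rain", "surface": "wet", "light": "night"},
    {"weather": "clear", "surface": "dry", "light": "day"},
    {"weather": "clear", "surface": "dry", "light": "night"},
    {"weather": "rain", "surface": None, "light": "day"},
    {"weather": "clear", "surface": None, "light": "night"},
]
filled = impute_column(rows, "surface", random.Random(0))
```

In these toy rows `weather` fully determines `surface` while `light` is unrelated to it, so the sketch conditions on `weather` and fills the two gaps with `wet` and `dry`. Sampling rather than taking the most frequent value is a design choice that keeps the spread of the imputed attribute closer to that of the observed data.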
