Journal: Accident Analysis & Prevention

Bayes classifiers for imbalanced traffic accidents datasets

Abstract

Traffic accident data sets are usually imbalanced: the number of instances in the killed or severe injuries class (minority) is much lower than the number in the slight injuries class (majority). This poses a challenging problem for classification algorithms and may yield a model that covers the slight injuries instances well while frequently misclassifying the killed or severe injuries instances. Based on traffic accident data collected on urban and suburban roads in Jordan over three years (2009-2011), three data balancing techniques were used: undersampling, which removes some instances of the majority class; oversampling, which creates new instances of the minority class; and a mixed technique that combines both. In addition, different Bayes classifiers were compared on the imbalanced and balanced data sets in order to identify factors that affect the severity of an accident: Averaged One-Dependence Estimators, Weightily Averaged One-Dependence Estimators, and Bayesian networks. The results indicated that using the balanced data sets, especially those created by oversampling, with Bayesian networks improved the classification of traffic accidents by severity and reduced the misclassification of killed and severe injuries instances. In addition, the following variables were found to contribute to the occurrence of a fatality or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. To the knowledge of the authors, this work is the first to analyze historical records of traffic accidents occurring in Jordan and the first to apply balancing techniques to the analysis of traffic accident injury severity. (C) 2015 Elsevier Ltd. All rights reserved.
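
The three balancing strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which tool or oversampling algorithm was used, so the sketch substitutes plain random resampling (dropping majority instances, duplicating minority instances). The record layout, the `severity` key, the class names, and the 90/10 split are all hypothetical.

```python
import random
from collections import Counter


def split_by_class(records, label_key="severity"):
    """Group records by their class label."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[label_key], []).append(rec)
    return groups


def resample(records, target_per_class, label_key="severity", seed=0):
    """Randomly resample every class to `target_per_class` instances.

    Classes larger than the target are undersampled (instances dropped,
    sampling without replacement). Smaller classes are oversampled by
    duplicating existing instances at random, an assumption made here
    because the abstract does not name the oversampling algorithm.
    """
    rng = random.Random(seed)
    balanced = []
    for group in split_by_class(records, label_key).values():
        if len(group) >= target_per_class:
            balanced.extend(rng.sample(group, target_per_class))
        else:
            extra = rng.choices(group, k=target_per_class - len(group))
            balanced.extend(group + extra)
    return balanced


def class_counts(records, label_key="severity"):
    """Count how many records fall into each class."""
    return Counter(rec[label_key] for rec in records)


# Toy data with made-up proportions: 90 slight vs 10 killed-or-severe.
records = (
    [{"id": i, "severity": "slight"} for i in range(90)]
    + [{"id": 90 + i, "severity": "killed_or_severe"} for i in range(10)]
)

counts = class_counts(records)
minority_n, majority_n = min(counts.values()), max(counts.values())

undersampled = resample(records, minority_n)   # shrink the majority class
oversampled = resample(records, majority_n)    # grow the minority class
mixed = resample(records, (minority_n + majority_n) // 2)  # meet halfway

for name, data in [("under", undersampled), ("over", oversampled), ("mixed", mixed)]:
    print(name, dict(class_counts(data)))
```

In this sketch the mixed variant simply resamples both classes to the midpoint of the two class sizes; the paper may combine the two techniques differently.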
