Bayes classifiers for imbalanced traffic accidents datasets

Mujalli Randa Oqab; Lopez Griselda; Garach Laura

Abstract

Traffic accident data sets are usually imbalanced: the number of instances in the killed or severe injuries class (minority) is much lower than the number in the slight injuries class (majority). This poses a challenging problem for classification algorithms and may yield a model that covers the slight injuries instances well while frequently misclassifying the killed or severe injuries instances. Based on traffic accident data collected on urban and suburban roads in Jordan over three years (2009-2011), three data balancing techniques were used: undersampling, which removes some instances of the majority class; oversampling, which creates new instances of the minority class; and a mixed technique that combines both. In addition, different Bayes classifiers were compared on the imbalanced and balanced data sets (Averaged One-Dependence Estimators, Weightily Averaged One-Dependence Estimators, and Bayesian networks) in order to identify factors that affect the severity of an accident. The results indicated that using the balanced data sets, especially those created with oversampling techniques, together with Bayesian networks improved the classification of a traffic accident according to its severity and reduced the misclassification of killed and severe injuries instances. In addition, the following variables were found to contribute to the occurrence of a killed casualty or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. This work, to the knowledge of the authors, is the first to analyze historical records of traffic accidents occurring in Jordan and the first to apply balancing techniques to the analysis of traffic accident injury severity. (C) 2015 Elsevier Ltd. All rights reserved.
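
The three balancing ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the keywords mention SMOTE, which synthesizes new minority instances, whereas the sketch below swaps in plain random duplication for oversampling to stay dependency-free. The function names, the class labels "SI" and "KSI", the toy record counts, and the choice of the mean class size as the mixed target are all assumptions made for illustration.

```python
import random
from collections import Counter

def undersample(rows, label_of, majority, target, rng):
    """Randomly drop majority-class rows until only `target` of them remain."""
    others = [r for r in rows if label_of(r) != majority]
    major = [r for r in rows if label_of(r) == majority]
    return others + rng.sample(major, min(target, len(major)))

def oversample(rows, label_of, minority, target, rng):
    """Randomly duplicate minority-class rows until there are `target` of them.

    Assumption: duplication stands in for synthetic generation such as SMOTE.
    """
    minor = [r for r in rows if label_of(r) == minority]
    extra = [rng.choice(minor) for _ in range(max(0, target - len(minor)))]
    return rows + extra

def mixed(rows, label_of, majority, minority, rng):
    """Shrink the majority and grow the minority to the mean of the two sizes."""
    counts = Counter(label_of(r) for r in rows)
    target = (counts[majority] + counts[minority]) // 2
    shrunk = undersample(rows, label_of, majority, target, rng)
    return oversample(shrunk, label_of, minority, target, rng)

# Hypothetical toy data: 90 slight-injury (SI) records and
# 10 killed-or-severe-injury (KSI) records.
rng = random.Random(0)
records = [("SI", i) for i in range(90)] + [("KSI", i) for i in range(10)]
label = lambda r: r[0]

under = undersample(records, label, "SI", 10, rng)
over = oversample(records, label, "KSI", 90, rng)
mix = mixed(records, label, "SI", "KSI", rng)

for name, data in [("under", under), ("over", over), ("mixed", mix)]:
    print(name, dict(Counter(map(label, data))))
```

Each variant ends with equal class sizes but a different total: undersampling discards majority information, oversampling repeats minority records, and the mixed variant trades off between the two.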

Bibliographic record

Source: Accident Analysis & Prevention, 2016, No. 3, pp. 37-51 (15 pages)
Authors: Mujalli Randa Oqab; Lopez Griselda; Garach Laura
Affiliations:
Hashemite Univ, Dept Civil Engn, Zarqa 13115, Jordan
Univ Granada, Dept Civil Engn, ETSI Caminos Canales & Puertos, C Severo Ochoa S-N, E-18071 Granada, Spain
Univ Granada, Dept Civil Engn, ETSI Caminos Canales & Puertos, C Severo Ochoa S-N, E-18071 Granada, Spain
Original format: PDF
Language: English
Keywords: Bayesian networks; Traffic accidents; Urban area; Imbalanced data set; SMOTE
