Journal: International Journal of Environmental Research and Public Health

Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

Abstract

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Because driving data are class-imbalanced, high-risk samples, as the minority class, are usually handled poorly by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling method, cost-sensitive loss function, and probability calibration method to handle the class-imbalance problem in recognizing risky drivers. The hyperparameters that control the sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the automated search yields a result consistent with manual search at a much lower computational cost. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of the predicted probabilities.
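Two ingredients named in the abstract can be sketched in a few lines: discrete Fourier transform features of a trajectory signal, and a cost-sensitive (class-weighted) binary cross-entropy. This is a minimal illustration under stated assumptions, not the paper's implementation. The function names `dft_magnitudes` and `weighted_cross_entropy`, the use of coefficient magnitudes as features, and the fixed weight `w_pos` are all hypothetical; in the paper the class weight is a hyperparameter tuned by Bayesian optimization.

```python
import cmath
import math


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients of a real signal.

    Assumption: magnitudes are used as features; the abstract only says
    "discrete Fourier transform coefficients".
    """
    n = len(signal)
    mags = []
    for k in range(n_coeffs):
        # Direct O(n) evaluation of the k-th DFT coefficient.
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        mags.append(abs(coeff))
    return mags


def weighted_cross_entropy(y_true, p_pred, w_pos, w_neg=1.0, eps=1e-12):
    """Mean binary cross-entropy with separate weights per class.

    y_true holds 0/1 labels (1 = risky driver, the minority class).
    w_pos > w_neg makes errors on risky drivers cost more.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


if __name__ == "__main__":
    # Hypothetical longitudinal-speed samples (m/s) for one vehicle.
    speed = [30.0, 30.5, 31.0, 30.5, 30.0, 29.5, 29.0, 29.5]
    print(dft_magnitudes(speed, n_coeffs=3))

    # With 4.29% risky drivers, one plausible starting weight is the
    # inverse class ratio; this choice is an assumption for illustration.
    w_pos = (1 - 0.0429) / 0.0429
    print(weighted_cross_entropy([1, 0, 0], [0.3, 0.1, 0.2], w_pos=w_pos))
```

With `w_pos` larger than `w_neg`, a misclassified risky driver contributes more to the loss than a misclassified normal driver, which is the effect cost-sensitive learning relies on for imbalanced data.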
