Statistical analysis of DMV crash data


Abstract

This paper presents the statistical methods and models we used to identify factors that cause fatal car crashes and high damage costs. The aim is to let the Virginia DMV make adjustments accordingly and reduce the number of crashes that are fatal or costly. Both the fatality analysis and the damage cost analysis use data from 2010 to 2014; data from 2015 were used for the fatality analysis only.

The first part of the paper describes how we identified factors that cause fatal crashes. Because the data are imbalanced, we first subsampled the non-fatal crashes and applied a higher weight to fatal crashes. We then fit a logistic regression model to predict whether a crash is fatal. To select the more important features, we kept only numeric factors with a correlation value greater than 0.1. The logistic regression achieved a recall of 40%. We also applied decision trees to the fatality analysis, building two models: one for the 2010-2014 data and one for the 2015 data.

The second part discusses how we identified factors that drive damage cost. Because the values of the damage cost variable are imbalanced, we propose a two-stage method to find its critical factors. First, we used k-nearest neighbors (KNN) to predict whether the damage cost is zero. Second, we fit a Lasso regression to the records with non-zero damage cost and identified the factors that contribute to the damage cost.
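The two analyses described in the abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the abstract does not name a library, so the use of scikit-learn is an assumption, as are the feature matrix `X`, the 3:1 subsampling ratio, the class weights, the reading of "correlation" as absolute Pearson correlation with the fatality label, the holdout split, `n_neighbors=15`, and `alpha=0.1`.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 4))  # hypothetical numeric crash features (synthetic)

# ---- Part 1: fatal vs. non-fatal crashes ----
# Synthetic rare label: only feature 0 influences fatality.
p_fatal = 1.0 / (1.0 + np.exp(-(-4.0 + 2.0 * X[:, 0])))
fatal = (rng.random(n) < p_fatal).astype(int)

# Subsample the majority class (assumed: 3 non-fatal rows per fatal row).
fatal_idx = np.flatnonzero(fatal == 1)
nonfatal_idx = rng.choice(
    np.flatnonzero(fatal == 0), size=3 * len(fatal_idx), replace=False
)
keep = np.concatenate([fatal_idx, nonfatal_idx])
X_s, y_s = X[keep], fatal[keep]

# Keep features whose |Pearson correlation| with the label exceeds 0.1.
corr = np.array([np.corrcoef(X_s[:, j], y_s)[0, 1] for j in range(X_s.shape[1])])
selected = np.flatnonzero(np.abs(corr) > 0.1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_s[:, selected], y_s, test_size=0.3, stratify=y_s, random_state=0
)
# Weight fatal crashes more heavily (the 3:1 weight is an assumption).
clf = LogisticRegression(class_weight={0: 1.0, 1: 3.0}).fit(X_tr, y_tr)
recall = recall_score(y_te, clf.predict(X_te))
print("selected features:", selected, "| recall on fatal class:", round(recall, 3))

# ---- Part 2: two-stage damage cost analysis ----
# Synthetic cost: feature 2 decides whether any damage is recorded,
# feature 3 drives the amount when the cost is non-zero.
has_cost = X[:, 2] + 0.3 * rng.normal(size=n) > 0
cost = np.where(has_cost, 10.0 + 3.0 * X[:, 3] + 0.5 * rng.normal(size=n), 0.0)
nonzero = cost != 0

# Stage 1: KNN classifier for "is the damage cost zero or not".
Xc_tr, Xc_te, z_tr, z_te = train_test_split(
    X, nonzero, test_size=0.3, random_state=0
)
knn = KNeighborsClassifier(n_neighbors=15).fit(Xc_tr, z_tr)
knn_acc = knn.score(Xc_te, z_te)

# Stage 2: Lasso on the non-zero rows; the L1 penalty zeroes out weak factors.
lasso = Lasso(alpha=0.1).fit(X[nonzero], cost[nonzero])
factors = np.flatnonzero(lasso.coef_)
print("KNN accuracy:", round(knn_acc, 3), "| factors kept by Lasso:", factors)
```

In this sketch the features with non-zero Lasso coefficients play the role of the "critical factors" of damage cost. On real crash records, categorical fields would need encoding and features should be standardized before KNN and Lasso, since both are scale-sensitive.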


Bibliographic details

Source
IEEE Systems and Information Engineering Design Symposium | 2016 | pp. 113-117 | 5 pages
Authors
Wenting Tong; Paul Cherian; Jianzhe Liu; Haoyu Li; Quanquan Gu
Original format
PDF
Keywords
decision analysis; machine learning; statistical modeling

Similar literature

1. The statistical analysis of crash-frequency data: A review and assessment of methodological alternatives [J]. Dominique Lord, Fred Mannering. Transportation Research. 2010, No. 5
2. Bias properties of Bayesian statistics in finite mixture of negative binomial regression models in crash data analysis [J]. Byung-Jung Park, Dominique Lord, Jeffrey D. Hart. Accident Analysis and Prevention. 2010, No. 2
3. The statistical analysis of multivariate failure time data: A marginal modeling approach, Ross L. Prentice, Shanshan Zhao, Boca Raton, FL: CRC Press [J]. Lin D. Y. Biometrics: Journal of the Biometric Society: An International Society Devoted to the Mathematical and Statistical Aspects of Biology. 2019, No. 4
4. Statistical analysis of DMV crash data [C]. Wenting Tong, Paul Cherian, Jianzhe Liu. IEEE Systems and Information Engineering Design Symposium. 2016
5. Integration of GIS and Spatial Statistics: A New Paradigm in Crash Data Analysis [D]. Khan, Ghazan. 2012
6. A systematic review of statistical models and outcomes of predicting fatal and serious injury crashes from driver crash and offense history data [O]. Reneta Slikboer, Samuel D. Muir, S. S. M. Silva. 2020
7. Global convergence and ascent property of a cyclic algorithm used for statistical analysis of crash data [O]. Issa Cherif Geraldo, Assi Ngessan, Kossi Essona Gneyou. 2018
8. National Center for Statistics and Analysis Collected Technical Studies, Volume 2. Accident Data Analysis of Occupant Injuries and Crash Characteristics - Eight Papers [R]. Bondy, N., Najjar, D., Partyka, S. 1981
