Journal: AtoZ: Novas Práticas em Informação e Conhecimento

A mineração de dados e a qualidade de conhecimentos extraídos dos boletins de ocorrência das rodovias federais brasileiras

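English translation: Data mining and the quality of knowledge extracted from the occurrence reports of Brazilian federal highways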


Abstract

Introduction: This paper presents and analyzes the results of applying a data mining process to the occurrence bulletins (accident reports) of Brazilian federal highways generated by the Federal Highway Police (PRF) in 2012. The aim is to assess the feasibility of applying data mining to the data provided by the PRF in order to identify associations between variables related to traffic accidents on all Brazilian federal highways.

Method: Symbolic supervised learning algorithms and an association rule generation algorithm, all implemented in the Weka tool, were used. Only the 2012 records of the database were considered. This portion of the database was preprocessed, the preprocessed data were used to extract models and patterns in Weka, and the extracted models and patterns were then evaluated.

Results: In supervised learning, the results obtained with the J48 and PART algorithms were considered promising because the area under the ROC curve (AUC) was above 0.5 for every class of accident cause. In addition, the Apriori algorithm generated 38 association rules with confidence greater than 0.8.

Conclusions: It is important to propose a data distribution model for this database so that it can be used for data mining as well as for other knowledge extraction and decision-making tasks. The study also noted the need to improve data quality from the initial stage of data gathering, that is, in the very systems used to record the data.
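The two evaluation measures named in the Results can be illustrated with a minimal sketch. It is not the Weka implementation used in the paper and it does not use PRF data: the transactions, item names, thresholds and function names below are all hypothetical. It shows how the confidence of a rule A -> C is computed as support(A and C) / support(A), and how a rank-based AUC gives 0.5 for a classifier that cannot separate the classes, which is why 0.5 serves as the baseline.

```python
from itertools import permutations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """confidence(A -> C) = support(A union C) / support(A)."""
    sup_a = support(antecedent, transactions)
    if sup_a == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / sup_a


def pair_rules(transactions, min_support, min_confidence):
    """Brute-force search over one-item -> one-item rules only.

    Apriori prunes candidates level by level and handles larger itemsets;
    this simplified loop only illustrates the two thresholds.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, c in permutations(items, 2):
        if support({a, c}, transactions) < min_support:
            continue
        conf = confidence({a}, {c}, transactions)
        if conf >= min_confidence:
            rules.append((a, c, conf))
    return rules


def auc(scores, labels):
    """Rank-based AUC: probability that a positive outscores a negative.

    Ties count as half a win, so constant scores give exactly 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy "accident records", each a set of attribute values.
toy_transactions = [
    frozenset({"rain", "wet_road", "skid"}),
    frozenset({"rain", "wet_road"}),
    frozenset({"rain", "wet_road", "skid"}),
    frozenset({"dry_road", "speeding"}),
    frozenset({"dry_road", "speeding", "skid"}),
]

if __name__ == "__main__":
    for a, c, conf in pair_rules(toy_transactions, 0.4, 0.8):
        print(f"{a} -> {c} (confidence {conf:.2f})")
    print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

For a multi-class problem such as accident cause, an AUC per class is usually obtained by treating each class in turn as the positive label against all others; whether the paper computed it exactly this way is not stated in the abstract.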
