The Aeronautical Journal

Large-scale data analysis on aviation accident database using different data mining techniques


Abstract

Data mining is an iterative process in which progress is defined by discovery through either automatic or manual methods. A data cleaning procedure is proposed to improve the quality of classification tasks in the knowledge discovery process by taking into account both redundant and conflicting data. The redundancy check is performed on the original dataset and the resultant dataset is preserved. This resultant dataset is then checked for conflicting data and, if any are found, they are corrected and updated in the original aircraft dataset. The updated dataset is then classified using a variety of classifiers such as Bayes, functions, lazy, MISC, rules and decision trees. The performance of these classifiers on the updated dataset is examined, and the results show a significant improvement in classification accuracy once redundancy and conflicts have been removed. This paper aims to address how data mining techniques can be used to understand complex system accidents in the aviation domain. Decision trees are considered to be one of the most powerful and popular approaches in knowledge discovery and data mining. The objective is to develop a classification model for aviation risk investigation and reduction using a decision tree induction method that enhances the ability to form decision trees, and thereby to show that decision trees achieve higher classification accuracy. Different feature selectors are used in this study to reduce the number of initial attributes.
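The cleaning procedure described in the abstract (a redundancy check followed by a conflict check and correction) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how conflicts are corrected, so the majority-vote rule used here is an assumption, and the function names, attribute values and class labels are all hypothetical.

```python
from collections import Counter, defaultdict


def remove_redundant(records):
    """Drop exact duplicate (features, label) records, keeping the first occurrence."""
    seen = set()
    unique = []
    for features, label in records:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def find_conflicts(records):
    """Return feature vectors that occur with more than one distinct class label."""
    labels_by_features = defaultdict(list)
    for features, label in records:
        labels_by_features[tuple(features)].append(label)
    return {f: ls for f, ls in labels_by_features.items() if len(set(ls)) > 1}


def clean(records):
    """Redundancy check, then conflict check and correction.

    ASSUMPTION: a conflict is corrected by relabelling with the majority
    label counted on the original data. The abstract does not specify the
    correction rule, so this is only one plausible choice.
    """
    original = [(tuple(f), l) for f, l in records]
    deduped = remove_redundant(original)
    conflicts = find_conflicts(deduped)

    # Count label frequencies on the original data for conflicting vectors only.
    votes = defaultdict(Counter)
    for features, label in original:
        if features in conflicts:
            votes[features][label] += 1
    corrected = {f: c.most_common(1)[0][0] for f, c in votes.items()}

    # Relabel conflicting records, then deduplicate again.
    relabelled = [(f, corrected.get(f, l)) for f, l in deduped]
    return remove_redundant(relabelled), conflicts


# Hypothetical toy records: (weather, flight phase) -> outcome.
records = [
    (("IMC", "approach"), "fatal"),
    (("IMC", "approach"), "fatal"),      # redundant copy
    (("IMC", "approach"), "non-fatal"),  # conflicts with the records above
    (("VMC", "cruise"), "non-fatal"),
]

cleaned, conflicts = clean(records)
print(cleaned)
print(conflicts)
```

On this toy input, the duplicate record is dropped, the one conflicting feature vector is relabelled with its majority class, and two records remain. The cleaned records could then be passed to any of the classifier families the abstract lists.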
