
An Efficient Outlier Mining Algorithm for Large Dataset



Abstract

Because outliers often carry useful information, outlier detection has become an active topic in data mining. This paper proposes an efficient outlier mining algorithm based on k-nearest neighbors (KNN). The algorithm identifies outliers more accurately by defining a correlation matrix that accounts for the importance of, and the correlation between, attributes. It also indexes the data with an R-tree and applies a pruning scheme to drastically reduce computation time. Experimental results show that the algorithm is more efficient than the traditional KNN algorithm, making it an effective solution for outlier mining in large datasets.
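
The abstract gives no formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes the correlation matrix acts as a symmetric positive semi-definite weight matrix `M` in a distance of the form sqrt((x - y)^T M (x - y)), that a point's outlier score is its distance to its k-th nearest neighbour, and that pruning means abandoning a point as soon as its score can no longer reach the current top n. A brute-force scan stands in for the R-tree index. The names `weighted_distance` and `top_n_knn_outliers` are hypothetical.

```python
import heapq
import math


def weighted_distance(x, y, M):
    """sqrt((x - y)^T M (x - y)); M is assumed symmetric positive semi-definite."""
    d = [a - b for a, b in zip(x, y)]
    total = 0.0
    for i, di in enumerate(d):
        for j, dj in enumerate(d):
            total += di * M[i][j] * dj
    return math.sqrt(max(total, 0.0))


def top_n_knn_outliers(points, M, k, n):
    """Return the n points with the largest k-th-nearest-neighbour distance.

    The result is a list of (score, index) pairs, highest score first.
    A brute-force scan is used here in place of an R-tree (assumption).
    """
    if k <= 0 or n <= 0:
        return []
    top = []      # min-heap of (score, index): the best n candidates so far
    cutoff = 0.0  # smallest score in `top` once it holds n entries
    for i, p in enumerate(points):
        nearest = []  # negated distances: a max-heap of the k smallest so far
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            dist = weighted_distance(p, q, M)
            if len(nearest) < k:
                heapq.heappush(nearest, -dist)
            elif dist < -nearest[0]:
                heapq.heapreplace(nearest, -dist)
            # -nearest[0] is an upper bound on the final k-th NN distance,
            # so a point at or below the cutoff cannot enter the top n.
            if len(nearest) == k and len(top) == n and -nearest[0] <= cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        else:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_n_knn_outliers(data, identity, k=2, n=1))
```

With the identity matrix the distance reduces to the ordinary Euclidean distance; putting larger diagonal entries or non-zero off-diagonal entries in `M` is one way attribute importance and correlation could influence which points count as neighbours.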
