Journal: Journal of Network and Computer Applications

MuDi-Stream: A multi density clustering algorithm for evolving data stream

Abstract

Density-based methods have emerged as a valuable class of techniques for clustering data streams, and a number of density-based stream clustering algorithms have been developed recently. However, existing algorithms are not without problems: clustering quality drops sharply when the data contain regions of widely differing density. This paper presents a new method called MuDi-Stream. It is an online-offline algorithm with four main components. In the online phase, it maintains summary information about the evolving multi-density data stream in the form of core mini-clusters. The offline phase generates the final clusters using an adapted density-based clustering algorithm. A grid-based method serves as an outlier buffer that handles both noise and multi-density data, and it also reduces the merging time of clustering. The algorithm is evaluated on various synthetic and real-world datasets using different quality metrics, and scalability results are also compared. The experimental results show that the proposed method improves clustering quality in multi-density environments. (C) 2014 Elsevier Ltd. All rights reserved.
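
The online phase described in the abstract can be illustrated with a rough Python sketch. This is not the paper's algorithm: the abstract gives no formulas, so the class names (`MiniCluster`, `StreamSummary`), the fading function `2 ** (-decay * dt)` (borrowed from common density-based stream clustering practice), the fixed `radius`, and the rule that promotes a grid cell after `promote_count` points are all assumptions made for illustration.

```python
import math


class MiniCluster:
    """Hypothetical fading summary of nearby points (weight and linear sum)."""

    def __init__(self, dim):
        self.weight = 0.0
        self.linear_sum = [0.0] * dim
        self.last_update = 0.0

    def fade(self, now, decay):
        # Assumed exponential fading: older points count for less.
        factor = 2.0 ** (-decay * (now - self.last_update))
        self.weight *= factor
        self.linear_sum = [v * factor for v in self.linear_sum]
        self.last_update = now

    def insert(self, point, now, decay):
        self.fade(now, decay)
        self.weight += 1.0
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]

    def center(self):
        return [v / self.weight for v in self.linear_sum]


class StreamSummary:
    """Online phase sketch: mini-clusters plus a grid used as an outlier buffer."""

    def __init__(self, dim, radius, cell_size, promote_count, decay=0.01):
        self.dim = dim
        self.radius = radius                # assumed fixed absorption radius
        self.cell_size = cell_size          # assumed uniform grid cell width
        self.promote_count = promote_count  # assumed promotion threshold
        self.decay = decay
        self.clusters = []
        self.grid = {}                      # cell index -> buffered points

    def add(self, point, now):
        # Try to absorb the point into the nearest existing mini-cluster.
        best, best_dist = None, math.inf
        for mc in self.clusters:
            d = math.dist(point, mc.center())
            if d < best_dist:
                best, best_dist = mc, d
        if best is not None and best_dist <= self.radius:
            best.insert(point, now, self.decay)
            return "absorbed"
        # Otherwise park the point in the grid cell that contains it.
        cell = tuple(math.floor(x / self.cell_size) for x in point)
        bucket = self.grid.setdefault(cell, [])
        bucket.append(point)
        if len(bucket) >= self.promote_count:
            # A dense enough cell becomes a new mini-cluster.
            mc = MiniCluster(self.dim)
            for p in bucket:
                mc.insert(p, now, self.decay)
            self.clusters.append(mc)
            del self.grid[cell]
            return "promoted"
        return "buffered"
```

A short usage example: with `StreamSummary(dim=2, radius=1.0, cell_size=1.0, promote_count=3)`, three nearby points fed through `add` are buffered in one grid cell and then promoted to a mini-cluster, after which a fourth nearby point is absorbed. An offline step, not shown here, would then run a density-based clustering algorithm over the mini-cluster centers.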