Journal of Computer and Communications

LeaDen-Stream: A Leader Density-Based Clustering Algorithm over Evolving Data Stream

Abstract

Clustering of evolving data streams must be performed in limited time with reasonable quality. Existing micro-cluster-based methods do not consider the distribution of data points inside a micro-cluster. We propose LeaDen-Stream (Leader Density-based clustering algorithm over evolving data Stream), a density-based clustering algorithm that uses leader clustering. The algorithm works in two phases. The online phase selects either mini-micro leaders or micro-cluster leaders, depending on the distribution of data points in the micro-clusters. The leader centers are then sent to the offline phase, which forms the final clusters. By carefully choosing between the two kinds of micro leaders, LeaDen-Stream reduces the time complexity of clustering while maintaining cluster quality. A pruning strategy that distinguishes dense from sparse mini-micro and micro-cluster leaders is also used to separate real data from noise. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.
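
To make the leader idea concrete, the following is a minimal sketch of generic one-pass leader clustering followed by pruning of sparse leaders. It is not the LeaDen-Stream algorithm itself: the abstract does not give its update rules, thresholds, or the mini-micro/micro-cluster selection criterion. The names `Leader`, `leader_clustering`, `prune_sparse`, `radius`, and `min_weight` are hypothetical, the running-mean center update is an assumption, and the handling of stream evolution over time is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Leader:
    center: tuple      # running mean of the points absorbed so far (assumption)
    weight: int = 1    # number of points absorbed by this leader


def leader_clustering(points, radius):
    """One pass over the stream: a point joins the nearest leader if it lies
    within `radius` of that leader's center, otherwise it starts a new leader."""
    leaders = []
    for p in points:
        nearest = min(leaders, key=lambda l: math.dist(l.center, p), default=None)
        if nearest is not None and math.dist(nearest.center, p) <= radius:
            n = nearest.weight + 1
            # Incremental mean update of the leader center.
            nearest.center = tuple(c + (x - c) / n for c, x in zip(nearest.center, p))
            nearest.weight = n
        else:
            leaders.append(Leader(center=tuple(p)))
    return leaders


def prune_sparse(leaders, min_weight):
    """Keep only dense leaders; sparse ones are treated as noise candidates."""
    return [l for l in leaders if l.weight >= min_weight]


if __name__ == "__main__":
    stream = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (10, 10)]
    leaders = leader_clustering(stream, radius=1.0)
    dense = prune_sparse(leaders, min_weight=2)
    print(len(leaders), "leaders,", len(dense), "dense")
```

In a two-phase design such as the one described above, the centers of the surviving leaders would be the input to an offline density-based clustering step; that step is not shown here.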
