Survey: Clustering Techniques of Data Stream

Abstract

Data stream mining is an emerging specialty within large-scale data mining that extracts useful knowledge from an entire data stream. Several mining tasks can be applied to data streams; one of the most important and widely used is clustering, which may be either hard (exclusive) or fuzzy (soft). Many sources that generate data streams have recently become available, so clustering such data has become an important topic for many researchers. Researchers have proposed several data stream clustering algorithms in recent years, and others have developed further algorithms. Data stream clustering differs from traditional clustering in many respects, and the main differences between the two are explained. It also poses new challenges, such as the single pass over the raw data, the unbounded size of the data, and the high arrival rate of data samples; the most prominent challenge, however, is the dynamic nature of the data. This paper presents a comprehensive study of hard data stream clustering methods and their algorithms, together with the advantages and disadvantages of these methods. It addresses many aspects of data streams, such as stream conditions, challenges, the dynamics they require, and how they change over time. It then presents the transition to modern trends in clustering algorithms and their utility in online applications. The survey aims to serve as an auxiliary reference that helps researchers choose a clustering algorithm compatible with the available data set in order to achieve the desired goal.
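To make the single-pass and unbounded-size constraints concrete, here is a minimal sketch of hard (exclusive) clustering over a stream using a sequential k-means style update. It is an illustration only, not one of the algorithms the survey reviews: the class name `OnlineKMeans`, the method `learn_one`, and the seeding rule (the first k points become the initial centroids) are assumptions made for this example.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlineKMeans:
    """Hypothetical single-pass hard clustering (sequential k-means).

    Memory use is O(k * d) no matter how many points arrive, because each
    point is inspected once and then discarded.
    """

    def __init__(self, k: int) -> None:
        self.k = k
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def learn_one(self, point: Sequence[float]) -> int:
        """Assign `point` to exactly one cluster and update that centroid."""
        # Assumption: the first k points seed the k centroids.
        if len(self.centroids) < self.k:
            self.centroids.append(list(point))
            self.counts.append(1)
            return len(self.centroids) - 1

        # Hard assignment: the point belongs only to the nearest centroid.
        idx = min(
            range(self.k),
            key=lambda i: squared_distance(self.centroids[i], point),
        )

        # Incremental mean update with step size 1/n for that cluster.
        self.counts[idx] += 1
        rate = 1.0 / self.counts[idx]
        centroid = self.centroids[idx]
        for d, x in enumerate(point):
            centroid[d] += rate * (x - centroid[d])
        return idx


if __name__ == "__main__":
    model = OnlineKMeans(k=2)
    stream = [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)]
    for p in stream:
        print(p, "-> cluster", model.learn_one(p))
    print(model.centroids)
```

The 1/n step size weights all past points equally, so this sketch does not address the dynamic nature of the data that the abstract singles out; replacing it with a fixed step size or a decay factor is one common way to let centroids follow a changing stream.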
