
Distributed clustering of ubiquitous data streams

Abstract

Information is now generated and gathered from distributed streaming data sources, which stresses communication and computing infrastructure and makes the data hard to transmit, process, and store. Knowledge discovery from ubiquitous data streams has become a major goal for all sorts of applications, mostly based on unsupervised techniques such as clustering. Two subproblems exist: clustering streaming data observations and clustering streaming data sources. The former searches for dense regions of the data space, identifying hot spots where data sources tend to produce data, while the latter finds groups of sources that behave similarly over time. To better assess the current status of this topic, this article presents a thorough review of distributed algorithms addressing either of the subproblems. We characterize clustering algorithms for ubiquitous data streams, discussing advantages and disadvantages of distributed procedures. Overall, distributed stream clustering methods improve communication ratios, processing speed, and resource consumption, while achieving clustering validity similar to that of their centralized counterparts. (C) 2013 John Wiley & Sons, Ltd.