
A Fully Distributed Clustering Algorithm Based On Fractal Dimension



Abstract

Clustering, the grouping of similar objects, is one of the most widely used procedures in data mining. It has received enormous attention, and many methods have been proposed in recent decades. However, traditional clustering algorithms require all data objects to be located at a single site, where they are analyzed. This requirement is a poor fit for current practice, in which very large data sets are often stored on independently working computers connected via local or wide area networks rather than at one site. In this paper we therefore propose a fully distributed clustering algorithm, called fully distributed clustering based on fractal dimension (FDCFD), which enables the sites to collaborate in forming a global clustering model with low communication cost. The main idea behind FDCFD is to use the fractal dimension to group points into clusters in such a way that no point in a cluster changes the cluster's fractal dimension radically. In our theoretical analysis, we demonstrate that the approach works well for clustering inherently distributed data: it collects information spread over several local sites to form a global clustering while avoiding the communication cost and delay of transmitting the data to a single site.
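
The grouping rule described in the abstract can be illustrated with a minimal single-site sketch. The abstract does not say how FDCFD estimates the fractal dimension, what threshold it uses, or how sites exchange information, so everything here is an assumption made for illustration: a box-counting estimate on points in the unit hypercube, and the hypothetical names `box_count`, `fractal_dimension`, `assign`, `levels`, and `max_change`. It is not the authors' implementation.

```python
import math

def box_count(points, cells_per_axis):
    """Count the grid cells occupied by points in the unit hypercube [0, 1)^d."""
    occupied = set()
    for p in points:
        # Map each coordinate to its cell index, clamping values at the upper edge.
        cell = tuple(min(int(x * cells_per_axis), cells_per_axis - 1) for x in p)
        occupied.add(cell)
    return len(occupied)

def fractal_dimension(points, levels=(1, 2, 3, 4, 5)):
    """Box-counting estimate (assumed estimator): least-squares slope of
    log N(r) against log(1/r) for cell sizes r = 2**-k."""
    xs = [k * math.log(2) for k in levels]
    ys = [math.log(box_count(points, 2 ** k)) for k in levels]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def assign(point, clusters, max_change=0.05):
    """Add point to the cluster whose estimated dimension it changes least,
    provided the change does not exceed max_change (a hypothetical threshold).
    Returns the index of the chosen cluster, or None if every change is too large."""
    best_index, best_change = None, None
    for i, members in enumerate(clusters):
        change = abs(fractal_dimension(members + [point]) - fractal_dimension(members))
        if change <= max_change and (best_change is None or change < best_change):
            best_index, best_change = i, change
    if best_index is not None:
        clusters[best_index].append(point)
    return best_index
```

For example, given one cluster of points along the diagonal of the unit square and another along its bottom edge, a new point on the bottom edge leaves the second cluster's estimate unchanged but shifts the first cluster's estimate noticeably, so `assign` places it in the second cluster. A distributed variant would presumably exchange compact summaries such as per-level occupied-cell sets rather than raw points, but the abstract does not specify this.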
