Published in: Fuzzy Sets and Systems

A hierarchical clustering algorithm based on fuzzy graph connectedness


Abstract

Many clustering methods have been proposed in the area of data mining, but only a few of them address incremental databases. This paper investigates a hierarchical clustering algorithm based on fuzzy graph connectedness (FHC). The algorithm applies fuzzy set theory to hierarchical clustering in order to discover clusters of arbitrary shape. It first partitions the data set into several sub-clusters using a partitioning method, and then constructs a fuzzy graph of sub-clusters by analyzing the fuzzy-connectedness degree among them. Computing the λ-cut graph yields the connected components of the fuzzy graph, which form the desired clustering. The algorithm can be applied to high-dimensional data sets and finds clusters of arbitrary shape, such as spherical, linear, elongated, or concave ones. The paper also presents an incremental algorithm, IFHC, for periodically incremental environments. FHC and IFHC handle data with categorical attributes as well as numerical attributes. The results of our experimental study on data sets of arbitrary shape and size are very encouraging. An experimental study on web log files is also conducted, showing that the method can help discover user access patterns efficiently. The investigation demonstrates that the proposed method generates better-quality clusters than traditional algorithms and scales well to large databases.
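The λ-cut step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define how the fuzzy-connectedness degree is computed, so the membership matrix `MU` below holds hypothetical values, and the function name `lambda_cut_components` is an assumption made for illustration. The sketch keeps only the edges whose membership is at least λ and returns the connected components of the remaining graph.

```python
from collections import deque


def lambda_cut_components(mu, lam):
    """Return the connected components of the lambda-cut of a fuzzy graph.

    mu  : symmetric n x n list of lists, edge memberships in [0, 1]
    lam : threshold; edge (i, j) survives the cut when mu[i][j] >= lam
    """
    n = len(mu)
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search over the edges that survive the cut.
        seen[start] = True
        queue = deque([start])
        comp = []
        while queue:
            i = queue.popleft()
            comp.append(i)
            for j in range(n):
                if j != i and not seen[j] and mu[i][j] >= lam:
                    seen[j] = True
                    queue.append(j)
        components.append(sorted(comp))
    return components


# Hypothetical memberships among five sub-clusters (illustrative values only;
# the paper's fuzzy-connectedness degree is not specified in the abstract).
MU = [
    [1.0, 0.8, 0.1, 0.0, 0.0],
    [0.8, 1.0, 0.3, 0.0, 0.0],
    [0.1, 0.3, 1.0, 0.7, 0.2],
    [0.0, 0.0, 0.7, 1.0, 0.6],
    [0.0, 0.0, 0.2, 0.6, 1.0],
]

for lam in (0.25, 0.5, 0.75):
    print(lam, lambda_cut_components(MU, lam))
# 0.25 [[0, 1, 2, 3, 4]]
# 0.5 [[0, 1], [2, 3, 4]]
# 0.75 [[0, 1], [2], [3], [4]]
```

Raising λ removes weaker edges, so one cluster splits into progressively finer ones. Sweeping λ in this way gives a nested sequence of clusterings, which is one plausible reading of the hierarchical aspect of the method.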
