K-modestream algorithm for clustering categorical data streams

Abstract

Clustering categorical data streams is a challenging problem because new data points are continuously added to the existing database at a rapid pace and there is no natural order among categorical values. Recently, some algorithms have been discussed to tackle the problem of clustering categorical data streams. However, all of these schemes require the user to pre-specify the number of clusters, which is not trivial and makes them inefficient in a data stream environment. In this paper, we propose a new clustering algorithm, named k-modestream, which follows the k-modes algorithm paradigm to dynamically cluster categorical data streams. It automatically computes the number of clusters and their initial modes simultaneously at regular time intervals. We analyse the time complexity of our scheme and perform various experiments on synthetic and real-world datasets to evaluate its efficacy.
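
For readers unfamiliar with the k-modes paradigm that the abstract refers to, the following is a minimal sketch of plain batch k-modes: records are assigned to the nearest mode under simple matching dissimilarity, and each mode is then recomputed as the most frequent value per attribute. It is not the authors' k-modestream algorithm; the abstract does not describe how the number of clusters and the initial modes are computed, so here both are simply supplied by the caller. All function and parameter names are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Most frequent value of each attribute over a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, initial_modes, max_iter=20):
    """Plain batch k-modes (illustrative only, not k-modestream).

    The number of clusters is fixed by len(initial_modes); this is exactly
    the user-supplied input that the abstract says k-modestream avoids.
    """
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each record goes to its nearest mode.
        labels = [
            min(range(len(modes)),
                key=lambda j: matching_dissimilarity(r, modes[j]))
            for r in records
        ]
        # Update step: recompute each mode; keep the old one if a cluster is empty.
        new_modes = []
        for j, old_mode in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            new_modes.append(column_modes(members) if members else old_mode)
        if new_modes == modes:  # converged
            break
        modes = new_modes
    return modes, labels


if __name__ == "__main__":
    data = [("a", "x"), ("a", "x"), ("a", "y"),
            ("b", "z"), ("b", "z"), ("c", "z")]
    print(k_modes(data, initial_modes=[("a", "x"), ("b", "z")]))
```

In a streaming setting, a routine like this would have to be re-run or updated incrementally as new records arrive; how k-modestream does that at regular time intervals is covered in the full paper, not in this abstract.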

Bibliographic details

  • Source
    CSI Transactions on ICT | 2017, Issue 3 | pp. 295-303 | 9 pages
  • Authors

    Ravi Sankar Sangam; Hari Om;

  • Author affiliations

    Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, Jharkhand 826004, India;

    Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, Jharkhand 826004, India;

  • Original format: PDF
  • Language: English
  • Keywords

    Data mining; Data streams; Clustering; K-modes;
