Conference: International Conference on Web Research

SW-DBSCAN: A Grid-based DBSCAN Algorithm for Large Datasets

Abstract

Data clustering aims to discover the underlying structure of data. It has many applications in data analysis and is one of the most widely used tools in data mining. DBSCAN is one of the best-known clustering algorithms. Its advantages are that it identifies clusters of arbitrary shape and determines the number of clusters itself. Because DBSCAN is sensitive to its two parameters, ε and MinPts, it may perform poorly when the dataset is unbalanced. To solve this problem, this paper proposes SW-DBSCAN, a sliding-window DBSCAN clustering algorithm that uses gridding and local parameters for unbalanced data. The algorithm divides the dataset into several grids. The size and shape of each grid depend on the density characteristics of the samples. Then, for each grid, the parameters are adjusted for local clustering, and the resulting data zones are eventually merged. Experimental results show that this algorithm can improve the performance of DBSCAN and can handle arbitrary and asymmetric data.
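The pipeline outlined in the abstract (grid the data, pick parameters per grid, cluster locally, merge) can be illustrated with a minimal Python sketch. This is not the authors' SW-DBSCAN: the abstract does not say how grid size and shape or the local parameters are derived, so the sketch assumes a fixed uniform grid, a hypothetical `local_eps` heuristic (a scaled mean distance to the (MinPts − 1)-th nearest neighbour), and a simple merge rule that joins clusters from different cells when two of their points lie within the smaller of the two local ε values. The names `grid_dbscan`, `local_eps`, `cell_size`, and `scale` are all illustrative.

```python
import math
from collections import defaultdict


def dbscan(points, eps, min_pts):
    """Plain DBSCAN on a list of 2-D tuples; returns one label per point, -1 = noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
    return labels


def local_eps(points, min_pts, scale=1.5):
    """Hypothetical heuristic (an assumption, not from the paper)."""
    if len(points) < min_pts:
        return None
    kth = []
    for p in points:
        d = sorted(math.dist(p, q) for q in points)
        kth.append(d[min_pts - 1])  # d[0] is the distance of p to itself
    return scale * sum(kth) / len(kth)


def grid_dbscan(points, cell_size, min_pts=4):
    """Assumed simplification: uniform grid, per-cell eps, cross-cell merge."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(math.floor(x / cell_size), math.floor(y / cell_size))].append(idx)

    labels = [-1] * len(points)
    info = []  # info[c] = (eps used, cell key) for global cluster id c
    for key, members in cells.items():
        pts = [points[i] for i in members]
        eps = local_eps(pts, min_pts)
        if eps is None:  # too few points in this cell: leave them as noise
            continue
        base = len(info)
        local = dbscan(pts, eps, min_pts)
        info.extend((eps, key) for _ in range(max(local) + 1))
        for i, lab in zip(members, local):
            if lab != -1:
                labels[i] = base + lab

    parent = list(range(len(info)))  # union-find over cluster ids

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            a, b = labels[i], labels[j]
            if a == -1 or b == -1 or info[a][1] == info[b][1]:
                continue  # only merge clusters that come from different cells
            if math.dist(points[i], points[j]) <= min(info[a][0], info[b][0]):
                parent[find(a)] = find(b)
    return [-1 if lab == -1 else find(lab) for lab in labels]


# Toy unbalanced data: a dense lattice straddling a cell border, a sparse
# lattice far away, and one isolated point.
dense = [((45 + i) / 10, (10 + j) / 10) for i in range(11) for j in range(5)]
sparse = [(11.0 + i, 11.0 + j) for i in range(4) for j in range(4)]
points = dense + sparse + [(2.0, 8.0)]

labels = grid_dbscan(points, cell_size=5.0)
global_labels = dbscan(points, eps=0.16, min_pts=4)
```

On this toy data, a single global ε tuned to the dense lattice (`eps=0.16`) marks every sparse point as noise, while the per-cell ε recovers both groups and rejoins the dense lattice across the cell border. The pairwise loops are quadratic and kept only for readability; a practical implementation would use a spatial index and would need a more careful merge rule for border points.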