A Density Based Dynamic Data Clustering Algorithm based on Incremental Dataset

Angel Latha Mary S.; K.R. Shankar Kumar


Abstract

Problem statement: Clustering and visualizing high-dimensional dynamic data is a challenging problem. Most existing clustering algorithms rely on static statistical relationships among the data. Dynamic clustering is a mechanism for adapting and discovering clusters in real-time environments. Many applications, such as incremental data mining in data warehouses and sensor networks, rely on dynamic data clustering algorithms.

Approach: In this work, we present a density-based dynamic data clustering algorithm for incremental datasets and compare its performance with full runs of standard DBSCAN and Chameleon on the dynamic dataset. Most clustering algorithms appear to perform well when judged by clustering accuracy, which is calculated by comparing the original class labels with the computed class labels. Measuring performance with a cluster validation metric, however, can give a different picture.

Results: This study addresses the problem of clustering a dynamic dataset that grows over time as more data is added. To evaluate the algorithms, we used the Generalized Dunn Index (GDI) and the Davies-Bouldin (DB) index as cluster validation metrics, together with the time taken for clustering.

Conclusion: In this study, we implemented and evaluated the proposed density-based dynamic clustering algorithm and compared its performance with the Chameleon and DBSCAN clustering algorithms. The proposed algorithm performed well in terms of both clustering accuracy and speed.
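The evaluation protocol outlined in the abstract (recluster as the dataset grows, then score each result with validation indices and wall-clock time) can be illustrated with a minimal sketch. This is not the authors' proposed algorithm: it shows only the full-rerun DBSCAN baseline, and it uses the classic Dunn index, which is one member of the generalized Dunn family, as a stand-in for GDI. The function names, the `BATCHES` data, and the `eps` and `min_pts` values are all assumptions chosen for illustration. The indices assume at least two clusters with more than one distinct point each.

```python
import math
import time
from collections import defaultdict

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force range query; the point itself is included.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:      # only core points expand the cluster
                queue.extend(nb)
    return labels

def group(points, labels):
    """Collect points by cluster label, dropping noise."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters[lab].append(p)
    return list(clusters.values())

def centroid(cluster):
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))

def davies_bouldin(clusters):
    """Davies-Bouldin index: lower means more compact, better separated clusters."""
    cents = [centroid(c) for c in clusters]
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    return sum(
        max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i)
        for i in range(k)
    ) / k

def dunn(clusters):
    """Classic Dunn index: min inter-cluster distance / max cluster diameter."""
    min_sep = min(math.dist(p, q)
                  for i, a in enumerate(clusters)
                  for b in clusters[i + 1:]
                  for p in a for q in b)
    max_diam = max(math.dist(p, q) for c in clusters for p in c for q in c)
    return min_sep / max_diam

def evaluate_incremental(batches, eps, min_pts):
    """Rerun DBSCAN from scratch after each batch arrives and score the result."""
    data, rows = [], []
    for batch in batches:
        data.extend(batch)
        start = time.perf_counter()
        labels = dbscan(data, eps, min_pts)      # full rerun on all data so far
        elapsed = time.perf_counter() - start
        clusters = group(data, labels)
        rows.append((len(data), len(clusters),
                     davies_bouldin(clusters), dunn(clusters), elapsed))
    return rows

# Hypothetical toy data: two square blobs, then a second batch with one outlier.
BATCHES = [
    [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)],
    [(0.5, 0.5), (10.5, 10.5), (5, 20)],
]

for n, k, db, di, secs in evaluate_incremental(BATCHES, eps=1.5, min_pts=3):
    print(f"n={n} clusters={k} DB={db:.3f} Dunn={di:.3f} time={secs:.6f}s")
```

Because the baseline reclusters every point after each batch, its cost grows with the total dataset size; an incremental method would aim to touch only the neighbourhood of the new points.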


Bibliographic details

Source
Journal of computer sciences, 2012, No. 5, pp. 656-664 (9 pages)
Authors
Angel Latha Mary S.; K.R. Shankar Kumar
Affiliations

Department of Information Technology, Information Institute of Engineering, Coimbatore, Tamilnadu, India;

Department of Electronics and Communication Engineering, Sri Ramakrishna Engineering College, Coimbatore, Tamilnadu, India;

Original format: PDF
Language: English
Keywords
clustering; cluster validation; cluster validation metrics;


