Parameter Reduction for Density-based Clustering on Large Data Sets



Abstract

Clustering on large datasets has become one of the most intensively studied areas as data volumes increase. One problem with clustering large datasets is that little domain knowledge is available to determine the input parameters. In density-based clustering, the main input is the minimum neighborhood radius. The problem becomes more difficult when clusters have different densities. In this paper, we explore an automatic approach to determining the minimum neighborhood radius based on the distribution of the dataset. The algorithm, MINR, is developed to determine the minimum neighborhood radii for clusters of different densities, based on many experiments and observations. MINR can be used together with any density-based clustering method to form a nonparametric clustering algorithm. In this paper, we combine MINR with an enhanced DBSCAN, e-DBSCAN. Experiments show that our approach is more efficient and scalable than TURN*.


Bibliographic details

Source
Computer Applications in Industry and Engineering | 2004 | pp. 181-186 | 6 pages
Authors
Baoying Wang; William Perrizo
Original format: PDF
Chinese Library Classification: Computer applications
Keywords
data mining; density-based clustering; parameter reduction


Similar literature


1. Enhancing density-based clustering: Parameter reduction and outlier detection [J]. Carmelo Cassisi, Alfredo Ferro, Rosalba Giugno. Information Systems. 2013, No. 3
2. Novel density-based and hierarchical density-based clustering algorithms for uncertain data [J]. Zhang Xianchao, Liu Han, Zhang Xiaotong. Neural Networks: The Official Journal of the International Neural Network Society. 2017
3. A fast density-based data stream clustering algorithm with cluster centers self-determined for mixed data [J]. Chen Jin-Yin, He Hui-Hao. Information Sciences: An International Journal. 2016
4. A density-based clustering algorithm and experiments on student dataset with noises using Rough set theory [C]. Bidipto Chakraborty, Kunal Chakma, Anjan Mukherjee. International Conference on Engineering and Technology. 2016
5. The use of clustering analysis and feature extraction for the reduction of very large data sets [D]. Lathon, Ruby Danyelle. 2000
6. Fast Nonparametric Density-Based Clustering of Large Data Sets Using a Stochastic Approximation Mean-Shift Algorithm [O]. Ollivier Hyrien, Andrea Baran
7. A DENSITY-BASED DATA REDUCTION FOR CLUSTERING ON LARGE DATA SETS [O]. ธรรมศักดิ์ เธียรนิเวศน์. 2548
