Range clustering: An algorithm for empirical evaluation of classical clustering algorithms

Abstract

Cluster analysis is a principal method in the analytics domain of data mining. The choice of clustering algorithm directly influences the clusters obtained. Data are clustered to identify patterns and trends that cannot be seen by simply inspecting the data. Clustering may be supervised (if a training data set is available) or unsupervised (if no training data set is available). Unsupervised clustering is usually done with the k-means algorithm, using any distance measure, most commonly Euclidean or Manhattan distance. The drawback of k-means on large data sets is the heavy computation required in every iteration to partition the data into multiple subsets, which limits its efficiency and usefulness for large data sets. We propose a range-based, single-pass clustering algorithm that clusters data according to the range each value falls in, where the ranges are calculated using the simple arithmetic mean between two values. The proposed algorithm is compared against the standard k-means algorithm (using Euclidean distance and Manhattan distance).
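
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes one-dimensional data and a set of hypothetical seed values; the range boundaries are taken to be the arithmetic means of consecutive sorted seeds, and each value is labeled in a single pass by the range it falls in. The names `range_boundaries`, `range_cluster`, and `seeds` are illustrative assumptions.

```python
from bisect import bisect_right


def range_boundaries(seeds):
    """Return the arithmetic mean of each pair of consecutive sorted seeds.

    Assumption: these midpoints act as the boundaries between ranges.
    """
    ordered = sorted(seeds)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


def range_cluster(values, seeds):
    """Label each value with the index of the range it falls in (one pass).

    A value exactly on a boundary is placed in the upper range.
    """
    bounds = range_boundaries(seeds)
    return [bisect_right(bounds, v) for v in values]


# Hypothetical seed values and data, chosen only for illustration.
seeds = [10, 20, 40]
values = [3, 14, 15, 16, 29, 30, 31, 55]
labels = range_cluster(values, seeds)
print(labels)  # [0, 0, 1, 1, 1, 2, 2, 2]
```

Under these assumptions the boundaries are 15 and 30, and each value is labeled with one binary search and no further iteration. In one dimension this matches a single nearest-seed assignment, without the repeated centroid updates that k-means performs.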

Bibliographic record

Source
International Conference on Contemporary Computing, 2016, pp. 1-4 (4 pages)
Authors
Nishant Arora; Sandeep Jain; Santosh Kumar Verma
Original format: PDF
Keywords
Clustering algorithms; Algorithm design and analysis; Partitioning algorithms; Machine learning algorithms; Standards; Information technology; Euclidean distance
